
Implement data quality checks on Amazon Redshift data assets and integrate with Amazon DataZone



Data quality is critical in data pipelines because it directly impacts the validity of the business insights derived from the data. Today, many organizations use AWS Glue Data Quality to define and enforce data quality rules on their data at rest and in transit. However, one of the most pressing challenges organizations face is giving users visibility into the health and reliability of their data assets. This is particularly important in the context of business data catalogs using Amazon DataZone, where users rely on the trustworthiness of the data for informed decision-making. As the data gets updated and refreshed, there is a risk of quality degradation due to upstream processes.

Amazon DataZone is a data management service designed to streamline data discovery, data cataloging, data sharing, and governance. It allows your organization to have a single secure data hub where everyone in the organization can find, access, and collaborate on data across AWS, on premises, and even third-party sources. It simplifies data access for analysts, engineers, and business users, allowing them to discover, use, and share data seamlessly. Data producers (data owners) can add context and control access through predefined approvals, providing secure and governed data sharing. The following diagram illustrates the Amazon DataZone high-level architecture. To learn more about the core components of Amazon DataZone, refer to Amazon DataZone terminology and concepts.

Implement data quality checks on Amazon Redshift data assets and integrate with Amazon DataZone

To address the challenge of data quality, Amazon DataZone now integrates directly with AWS Glue Data Quality, allowing you to visualize data quality scores for AWS Glue Data Catalog assets directly within the Amazon DataZone web portal. You can access insights about data quality scores for various key performance indicators (KPIs) such as data completeness, uniqueness, and accuracy.

By providing a comprehensive view of the data quality validation rules applied to a data asset, you can make informed decisions about the suitability of specific data assets for their intended use. Amazon DataZone also surfaces historical trends of the asset’s data quality runs, giving full visibility and indicating whether the quality of the asset has improved or degraded over time. With the Amazon DataZone APIs, data owners can integrate data quality rules from third-party systems into a specific data asset. The following screenshot shows an example of data quality insights embedded in the Amazon DataZone business catalog. To learn more, see Amazon DataZone now integrates with AWS Glue Data Quality and external data quality solutions.

In this post, we show how to capture data quality metrics for data assets produced in Amazon Redshift.

Amazon Redshift is a fast, scalable, and fully managed cloud data warehouse that allows you to process and run your complex SQL analytics workloads on structured and semi-structured data. Amazon DataZone natively supports data sharing for Amazon Redshift data assets.

With Amazon DataZone, the data owner can directly import the technical metadata of a Redshift database table and views into the Amazon DataZone project’s inventory. Because these data assets are imported into Amazon DataZone without going through the AWS Glue Data Catalog, there is a gap in data quality integration. This post proposes a solution to enrich the Amazon Redshift data asset with data quality scores and KPI metrics.

Solution overview

The proposed solution uses AWS Glue Studio to create a visual extract, transform, and load (ETL) pipeline for data quality validation and a custom visual transform to publish the data quality results to Amazon DataZone. The following screenshot illustrates this pipeline.

Glue ETL pipeline

The pipeline starts by establishing a connection directly to Amazon Redshift and then applies the necessary data quality rules defined in AWS Glue based on the organization’s business needs. After applying the rules, the pipeline validates the data against them. The outcome of the rules is then pushed to Amazon DataZone using a custom visual transform that implements the Amazon DataZone APIs.

The custom visual transform in the data pipeline makes the complex Python logic reusable, so data engineers can encapsulate this module in their own data pipelines to publish data quality results. The transform can be used independently of the source data being analyzed.

Each business unit can use this solution while retaining full autonomy in defining and applying their own data quality rules tailored to their specific domain. These rules maintain the accuracy and integrity of their data. The prebuilt custom transform acts as a central component for each of these business units, which can reuse the module in their domain-specific pipelines, thereby simplifying the integration. To publish domain-specific data quality results using the custom visual transform, each business unit simply reuses the code libraries and configures parameters such as the Amazon DataZone domain, the role to assume, and the name of the table and schema in Amazon DataZone where the data quality results need to be posted.

In the following sections, we walk through the steps to publish the AWS Glue Data Quality score and results for your Redshift table to Amazon DataZone.

Prerequisites

To follow along, you should have the following:

The solution uses a custom visual transform to publish the data quality scores from AWS Glue Studio. For more information, refer to Create your own reusable visual transforms for AWS Glue Studio.

A custom visual transform lets you define, reuse, and share business-specific ETL logic with your teams. Each business unit can apply their own data quality checks relevant to their domain and reuse the custom visual transform to push the data quality results to Amazon DataZone and integrate the data quality metrics with their data assets. This eliminates the risk of inconsistencies that can arise when similar logic is written in different code bases, and it helps achieve a faster development cycle and improved efficiency.

For the custom transform to work, you need to upload two files to an Amazon Simple Storage Service (Amazon S3) bucket in the same AWS account where you plan to run AWS Glue. Download the following files:

Copy these downloaded files to your AWS Glue assets S3 bucket in the transforms folder (s3://aws-glue-assets-/transforms). By default, AWS Glue Studio reads all JSON files from the transforms folder in the same S3 bucket.

customtransform files
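For orientation, a custom visual transform is a Python function that AWS Glue Studio exposes as a node by registering it on DynamicFrame, driven by the metadata in the accompanying JSON definition file. The following is only a minimal sketch of that shape with illustrative parameter names; it is not the downloaded post_dq_results_to_datazone.py.

Python

  from awsglue import DynamicFrame


  # Sketch of the general shape of a custom visual transform (illustrative only).
  def post_dq_results_to_datazone(self, domain_id, table_name, schema_name,
                                  ruleset_name, max_results=5, role_to_assume=None):
      # 'self' is the incoming DynamicFrame, here the ruleOutcomes output of the
      # Evaluate Data Quality node. The real script reads the rule outcomes,
      # finds the matching DataZone asset, and posts the results, then returns
      # the frame unchanged so the pipeline can continue.
      return self


  # Registering the function on DynamicFrame is what makes it show up as a
  # reusable node in the AWS Glue Studio visual editor.
  DynamicFrame.post_dq_results_to_datazone = post_dq_results_to_datazone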

In the following sections, we walk you through the steps of building an ETL pipeline for data quality validation using AWS Glue Studio.

Create a new AWS Glue visual ETL job

You can use AWS Glue for Spark to read from and write to tables in Redshift databases. AWS Glue provides built-in support for Amazon Redshift. On the AWS Glue console, choose Author and edit ETL jobs to create a new visual ETL job.

Establish an Amazon Redshift connection

In the job pane, choose Amazon Redshift as the source. For Redshift connection, choose the connection created as a prerequisite, then specify the relevant schema and table on which the data quality checks need to be applied.

dqrulesonredshift

Apply data quality rules and validation checks on the source

The next step is to add the Evaluate Data Quality node to your visual job editor. This node allows you to define and apply domain-specific data quality rules relevant to your data. After the rules are defined, you can choose to output the data quality results. The results of these rules can be stored in an Amazon S3 location. You can additionally choose to publish the data quality results to Amazon CloudWatch and set alert notifications based on thresholds.
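For orientation, the following sketch shows roughly how the Evaluate Data Quality node appears in the script that AWS Glue Studio generates behind the visual editor. The ruleset is written in Data Quality Definition Language (DQDL); the column names, thresholds, evaluation context, and S3 path are illustrative assumptions rather than values from this pipeline.

Python

  from awsgluedq.transforms import EvaluateDataQuality

  # Illustrative DQDL ruleset; replace the rules with your domain-specific checks.
  EXAMPLE_RULESET = """
  Rules = [
      IsComplete "order_id",
      Uniqueness "order_id" > 0.99,
      ColumnValues "order_status" in ["PENDING", "SHIPPED", "DELIVERED"]
  ]
  """

  dq_results = EvaluateDataQuality().process_rows(
      frame=redshift_source,  # DynamicFrame produced by the Amazon Redshift source node
      ruleset=EXAMPLE_RULESET,
      publishing_options={
          "dataQualityEvaluationContext": "redshift_dq_check",
          "enableDataQualityCloudWatchMetrics": True,   # optional CloudWatch publishing
          "enableDataQualityResultsPublishing": True,
          "resultsS3Prefix": "s3://example-bucket/dq-results/",  # illustrative location
      },
  )
  # The per-rule outcomes can then be selected from the returned collection, e.g.:
  # rule_outcomes = SelectFromCollection.apply(dfc=dq_results, key="ruleOutcomes")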

Preview data quality results

Choosing the data quality results automatically adds the new node ruleOutcomes. The preview of the data quality results from the ruleOutcomes node is illustrated in the following screenshot. The node outputs the data quality results, including the outcome of each rule and its failure reason.

previewdqresults

Publish the data quality results to Amazon DataZone

The output of the ruleOutcomes node is then passed to the custom visual transform. After both files are uploaded, the AWS Glue Studio visual editor automatically lists the transform as defined in post_dq_results_to_datazone.json (in this case, Datazone DQ Result Sink) among the other transforms. Additionally, AWS Glue Studio parses the JSON definition file to display the transform metadata such as name, description, and list of parameters. In this case, it lists parameters such as the role to assume, the domain ID of the Amazon DataZone domain, and the table and schema name of the data asset.

Fill in the parameters (a sketch of the resulting transform call follows this list):

  • Role to assume is optional and can be left empty; it’s only needed when your AWS Glue job runs in an associated account
  • For Domain ID, the ID for your Amazon DataZone domain can be found in the Amazon DataZone portal by choosing the user profile name

datazone page

  • Table name and Schema name are the same ones you used when creating the Redshift source transform
  • Data quality ruleset name is the name you want to give to the ruleset in Amazon DataZone; you could have multiple rulesets for the same table
  • Max results is the maximum number of Amazon DataZone assets you want the script to return in case multiple matches are found for the same table and schema name
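As a rough illustration of where these values end up, the configured node appears in the generated job script as a call on the ruleOutcomes frame. The parameter names and values below are assumptions based on the list above, not copied from the actual transform definition.

Python

  # Sketch of the configured custom transform call in the generated script
  # (parameter names and values are illustrative).
  dq_to_datazone = rule_outcomes.post_dq_results_to_datazone(
      role_to_assume="",                   # optional; empty when running in the same account
      domain_id="dzd_example123",          # from the Amazon DataZone portal
      table_name="customer_orders",        # same table used in the Redshift source node
      schema_name="public",
      ruleset_name="redshift_dq_ruleset",
      max_results=5,
  )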

Edit the job details and, in the job parameters, add the following key-value pair to import the right version of Boto3 containing the latest Amazon DataZone APIs:

--additional-python-modules

boto3>=1.34.105
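To confirm inside the job that the newer SDK was actually picked up, a small check like the following sketch can be added near the top of the script; the assumption here is that the transform relies on DataZone’s time-series APIs (such as post_time_series_data_points), which only exist in recent Boto3 releases.

Python

  import boto3

  # Log the SDK version the job actually loaded.
  print("boto3 version:", boto3.__version__)

  # Raises AttributeError if the job is still using an SDK that predates
  # the Amazon DataZone data quality (time series) APIs.
  assert hasattr(boto3.client("datazone"), "post_time_series_data_points")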

Finally, save and run the job.

dqrules post datazone

The implementation logic for inserting the data quality values into Amazon DataZone is described in the post Amazon DataZone now integrates with AWS Glue Data Quality and external data quality solutions. In the post_dq_results_to_datazone.py script, we only adapted the code to extract the metadata from the AWS Glue Evaluate Data Quality transform results, and added methods to find the right DataZone asset based on the table information. You can review the code in the script if you are curious.
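As an illustration of the kind of calls involved, that lookup-and-post logic boils down to two Boto3 DataZone calls, sketched below. The domain ID, asset names, response parsing, form type identifier, and result payload are all assumptions for illustration; the actual field names and content schema are defined by the script and by the DataZone data quality form type.

Python

  import json
  from datetime import datetime, timezone

  import boto3

  datazone = boto3.client("datazone")
  DOMAIN_ID = "dzd_example123"  # illustrative domain ID

  # 1. Find the asset that matches the Redshift schema and table name.
  #    (Response parsing below is schematic.)
  found = datazone.search(
      domainIdentifier=DOMAIN_ID,
      searchScope="ASSET",
      searchText="public.customer_orders",
      maxResults=5,
  )
  asset_id = found["items"][0]["assetItem"]["identifier"]

  # 2. Attach the data quality outcome to that asset as a time series form.
  datazone.post_time_series_data_points(
      domainIdentifier=DOMAIN_ID,
      entityIdentifier=asset_id,
      entityType="ASSET",
      forms=[{
          "formName": "redshift_dq_ruleset",  # ruleset name parameter from the transform
          "typeIdentifier": "amazon.datazone.DataQualityResultFormType",  # assumed form type
          "timestamp": datetime.now(timezone.utc),
          # Illustrative payload; the real schema comes from the form type.
          "content": json.dumps({
              "passingPercentage": 100.0,
              "evaluationsCount": 1,
              "evaluations": [
                  {"description": 'IsComplete "order_id"', "types": ["Completeness"], "status": "PASS"}
              ],
          }),
      }],
  )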

After the AWS Glue ETL job run is complete, you can navigate to the Amazon DataZone console and confirm that the data quality information is now displayed on the relevant asset page.

Conclusion

In this post, we demonstrated how you can use the power of AWS Glue Data Quality and Amazon DataZone to implement comprehensive data quality monitoring for your Amazon Redshift data assets. By integrating these two services, you can provide data consumers with valuable insights into the quality and reliability of the data, fostering trust and enabling self-service data discovery and more informed decision-making across your organization.

If you’re looking to enhance the data quality of your Amazon Redshift environment and improve data-driven decision-making, we encourage you to explore the integration of AWS Glue Data Quality and Amazon DataZone, and the new preview of OpenLineage-compatible data lineage visualization in Amazon DataZone. For more information and detailed implementation guidance, refer to the following resources:


About the Authors

Fabrizio Napolitano is a Principal Specialist Solutions Architect for DB and Analytics. He has worked in the analytics domain for the last 20 years, and has recently and quite suddenly become a Hockey Dad after moving to Canada.

Lakshmi Nair is a Senior Analytics Specialist Solutions Architect at AWS. She specializes in designing advanced analytics systems across industries. She focuses on crafting cloud-based data platforms, enabling real-time streaming, big data processing, and robust data governance.

Varsha Velagapudi is a Senior Technical Product Manager with Amazon DataZone at AWS. She focuses on improving the data discovery and curation required for data analytics. She is passionate about simplifying customers’ AI/ML and analytics journey to help them succeed in their day-to-day tasks. Outside of work, she enjoys nature and outdoor activities, reading, and traveling.

APIs, SBOMs, and Static Analysis


As part of an ongoing effort to keep you informed about our latest work, this blog post summarizes some recent publications from the SEI in the areas of application programming interfaces (APIs), software bills of materials (SBOMs), secure development, Architecture Analysis and Design Language (AADL), and static analysis.

These publications highlight the latest work from SEI technologists in these areas. This post includes a listing of each publication, its author(s), and links where they can be accessed on the SEI website.

Application Programming Interface (API) Vulnerabilities and Risks
by McKinley Sconiers-Hasan

Web-accessible application programming interfaces (APIs) are increasingly common, and they are often designed and implemented in a way that creates security risks. Building on a taxonomy from OWASP, this report describes 11 common vulnerabilities and three risks related to APIs, providing suggestions about how to fix or reduce their impact. Recommendations include using a standard API documentation process, using automated testing, and ensuring the security of the identity and access management system.
Read the SEI Special Report.

Software Bill of Materials (SBOM) Considerations for Operational Test & Evaluation Activities
by Michael Bandor

This white paper looks at potential roles for SBOMs within various Operational Test & Evaluation (OT&E) activities. It covers the history and background of SBOMs, recent developments (as of the creation of the white paper), general challenges and questions to ask, and five specific use cases. It closes with conclusions and recommendations.

SBOMs are currently in early and varying stages of adoption across industry and within the DoD. There are still issues with the quality (e.g., completeness, accuracy, currency, etc.) of the SBOMs being produced, as well as with adherence to the minimum essential elements identified by the U.S. Department of Commerce. Legacy systems as well as cloud-based systems present challenges for producing SBOMs. The DoD is currently developing proposed guidance for addressing the SBOM requirement by programs.

Given this early phase of adoption, it is recommended that SBOMs be used to augment, but not replace, the current methods used by Operational Test (OT) personnel in performing testing functions, and that testers not rely solely on SBOM data. The limitations are not intrinsic, and we can expect that SBOMs will prove to be increasingly essential and useful for OT activities.
Read the SEI white paper.

Secure Systems Don’t Happen by Accident
by Timothy A. Chick

Most cybersecurity breaches are due to defects in design or code, including both coding and logic errors. The best way to address these challenges is to design and build more secure solutions. In this webcast, Tim Chick discusses how security can be an integral aspect of the entire software lifecycle. The key to success is to follow deliberate engineering practices focused on reducing security risks through the use of software assurance techniques.

What attendees will learn:

  • the importance of cybersecurity, including examples of security failures
  • qualities to look for when evaluating third-party software
  • the relationship between quality and security
  • engineering techniques used throughout the development lifecycle to reduce cyber risks

View the webcast.

Reachability of System Operation Modes in AADL
by Lutz Wrage

Components in an AADL (Architecture Analysis and Design Language) model can have modes that determine which subcomponents and connections are active. Transitions between modes are triggered by events originating from the modeled system’s environment or from other components in the model. Modes and transitions can occur at any level of the component hierarchy. The combinations of component modes (called system operation modes, or SOMs) define the system’s configurations. It is important to know which SOMs can actually occur in the system, especially in the area of system safety, because a system may contain components that should not be active at the same time, for example, a car’s brake and accelerator. This report presents an algorithm that constructs the set of reachable SOMs for a given AADL model and the transitions between them.
Read the SEI Technical Report.

Automated Repair of Static Analysis Alerts
by David Svoboda

Developers know that static analysis helps make code more secure. However, heuristic static analysis tools often produce numerous false positives, hindering their usefulness. In this podcast, David Svoboda, a software security engineer in the SEI’s CERT Division, discusses Redemption, a new open-source tool from the SEI that automatically repairs common errors in C/C++ code identified by static analysis alerts, making code safer and static analysis less overwhelming.
Listen to/view the podcast.

Navigating Capability-Based Planning: The Benefits, Challenges, and Implementation Essentials
by Anandi Hira and William Nichols

Capability-based planning (CBP) defines a framework for acquisition and design that encompasses a comprehensive view of existing abilities and future needs for the purpose of supporting strategic decisions about what is needed and how to effectively achieve it. Both business and government acquisition domains use CBP for financial success or to design well-balanced defense systems. Unsurprisingly, the definitions differ across these domains. This paper endeavors to reconcile these definitions to provide an overarching view of CBP, its potential, and the practical implementation of its principles.
Read the white paper.

My Story in Computing, with Sam Procter
by Sam Procter

Sam Procter, an SEI senior architecture researcher, started out studying computer science at the University of Nebraska, but he didn’t like it. It wasn’t until he took his first software engineering course that he knew he’d found his career path. In this SEI podcast, Procter discusses the early influences that shaped his career, the importance of embracing different types of diversity in his research and work, and the value of work-life balance.
Listen to/view the podcast.

Additional Resources

View the latest SEI research in the SEI Digital Library.
View the latest podcasts in the SEI Podcast Series.
View the latest installments in the SEI Webcast Series.

Iran’s Charming Kitten Targets US Elections, Israeli Military


A threat group linked to Iran’s Islamic Revolutionary Guard Corps (IRGC) has launched new cyberattacks against email accounts associated with the upcoming US presidential election, as well as high-profile military and other political targets in Israel. The activity, which predominantly comes in the form of socially engineered phishing campaigns, is in retaliation for Israel’s ongoing military campaign in Gaza and the US’ support for it, and is expected to continue as tensions rise in the region.

Google’s Threat Analysis Group (TAG) detected and blocked “numerous” attempts by Iran-backed APT42, perhaps best known as Charming Kitten, to log in to the personal email accounts of about a dozen individuals affiliated with President Biden and with former President Trump, according to a blog post published yesterday. Targets of the activity included current and former US government officials as well as individuals associated with the respective campaigns.

Moreover, the threat group remains persistent in its ongoing efforts to compromise the personal accounts of individuals affiliated with the current US Vice President and now presidential candidate Kamala Harris, and former President Trump, “including current and former government officials and individuals associated with the campaign,” according to the post.

The discovery comes as a Telegram-based bot service called “IntelFetch” has also been found to be aggregating compromised credentials linked to the DNC and Democratic Party websites.

Charming Kitten Bats Around Israeli Targets

In addition to the election-related attacks, TAG researchers also have been tracking various phishing campaigns against Israeli military and political targets, including people with connections to the defense sector as well as diplomats, academics, and NGOs, which have ramped up significantly since April, according to the post.

Google recently took down multiple Google Sites pages created by the group “masquerading as a petition from the legitimate Jewish Agency for Israel calling on the Israeli government to enter into mediation to end the war,” according to the post.

Charming Kitten also abused Google Sites in an April phishing campaign targeting Israeli military, defense, diplomats, academics, and civil society, sending emails that impersonated a journalist requesting comment on recent air strikes to target former senior Israeli military officials and an aerospace executive.

“Over the last six months, we have systematically disrupted these attackers’ ability to abuse Google Sites in more than 50 similar campaigns,” according to Google TAG.

One such campaign involved a phishing lure featuring an attacker-controlled Google Sites link that would direct the victim to a fake Google Meet landing page, while other lures used OneDrive, Dropbox, and Skype.

New & Ongoing APT42 Phishing Activity

In other attacks, Charming Kitten has engaged in a diverse range of social engineering tactics in phishing campaigns that reflect its geopolitical stance. The activity is not likely to let up for the foreseeable future, according to Google TAG.

A recent campaign against Israeli diplomats, academics, NGOs, and political entities came from accounts hosted by a variety of email service providers, the researchers found. Though the messages did not contain malicious content, Google TAG surmised that they were “likely intended to elicit engagement from the recipients before APT42 attempted to compromise the targets,” and Google suspended the Gmail accounts associated with the APT.

A separate June campaign targeted Israeli NGOs using a benign PDF email attachment impersonating a legitimate political entity that contained a shortened URL redirecting to a phishing-kit landing page designed to harvest Google login credentials. Indeed, APT42 often uses phishing links embedded either directly in the body of the email or as a link in an otherwise innocuous PDF attachment, the researchers noted.

“In such cases, APT42 would engage their target with a social engineering lure to set up a video meeting and then link to a landing page where the target was prompted to log in and sent to a phishing page,” according to the post.

Another APT42 campaign template involves sending legitimate PDF attachments as part of a social engineering lure to build trust and encourage the target to engage on other platforms like Signal, Telegram, or WhatsApp, most likely as a way to deliver a phishing kit to harvest credentials, according to Google TAG.

Politically Motivated Attacks to Continue

All of this is familiar territory for APT42/Charming Kitten, which is well known for politically motivated cyberattacks. It has been extremely active against Israel, the US, and other global targets since Israel began its military campaign in Gaza in retaliation for the Hamas Oct. 7 attack in Israel.

Iran overall has a long history of responding to tensions in the region with cyberattacks against Israel and the US. In the past six months alone, the US and Israel accounted for roughly 60% of APT42’s known geographic targeting, according to Google TAG. More activity is expected after Israel’s recent assassination of a top Hamas leader on Iranian soil, as experts believe cyberspace will remain a primary battleground for Iran-backed threat actors.

“APT42 is a sophisticated, persistent threat actor and they show no signs of stopping their attempts to target users and deploy novel tactics,” according to Google TAG. “As hostilities between Iran and Israel intensify, we can expect to see increased campaigns there from APT42.”

The researchers also included a list of indicators of compromise (IoCs) in the post, including domains and IP addresses known to be used by APT42. Organizations that may be targeted should also remain vigilant for the various social engineering and phishing tactics used by the group in its recently discovered threat campaigns.



Test-Driving HTML Templates


  <!DOCTYPE html>
  <html>
  <body>
    <div>foo</p>
  </body>
  </html>

Let’s see how to do it in stages: we start with the following test that
tries to compile the template. In Go we use the standard html/template package.

Go

  func Test_wellFormedHtml(t *testing.T) {
    templ := template.Must(template.ParseFiles("index.tmpl"))
    _ = templ
  }

In Java, we use jmustache
because it is very simple to use; Freemarker or
Velocity are other common choices.

Java

  @Test
  void indexIsSoundHtml() {
      var template = Mustache.compiler().compile(
              new InputStreamReader(
                      getClass().getResourceAsStream("/index.tmpl")));
  }

If we run this test, it will fail, because the index.tmpl file does
not exist. So we create it, with the above broken HTML. Now the test should pass.

Then we create a model for the template to use. The application manages a todo-list, and
we can create a minimal model for demonstration purposes.

Go

  func Test_wellFormedHtml(t *testing.T) {
    templ := template.Must(template.ParseFiles("index.tmpl"))
    model := todo.NewList()
    _ = templ
    _ = model
  }

Java

  @Test
  void indexIsSoundHtml() {
      var template = Mustache.compiler().compile(
              new InputStreamReader(
                      getClass().getResourceAsStream("/index.tmpl")));
      var model = new TodoList();
  }

Now we render the template, saving the result in a bytes buffer (Go) or in a String (Java).

Go

  func Test_wellFormedHtml(t *testing.T) {
    templ := template.Must(template.ParseFiles("index.tmpl"))
    model := todo.NewList()
    var buf bytes.Buffer
    err := templ.Execute(&buf, model)
    if err != nil {
      panic(err)
    }
  }

Java

  @Test
  void indexIsSoundHtml() {
      var template = Mustache.compiler().compile(
              new InputStreamReader(
                      getClass().getResourceAsStream("/index.tmpl")));
      var model = new TodoList();
  
      var html = template.execute(model);
  }

At this point, we want to parse the HTML and we expect to see an
error, because in our broken HTML there is a div element that
is closed by a p element. There is an HTML parser in the Go
standard library, but it is too lenient: if we run it on our broken HTML, we don't get an
error. Luckily, the Go standard library also has an XML parser that can be
configured to parse HTML (thanks to this Stack Overflow answer)

Go

  func Test_wellFormedHtml(t *testing.T) {
    templ := template.Must(template.ParseFiles("index.tmpl"))
    model := todo.NewList()

    // render the template into a buffer
    var buf bytes.Buffer
    err := templ.Execute(&buf, model)
    if err != nil {
      panic(err)
    }

    // check that the template can be parsed as (lenient) XML
    decoder := xml.NewDecoder(bytes.NewReader(buf.Bytes()))
    decoder.Strict = false
    decoder.AutoClose = xml.HTMLAutoClose
    decoder.Entity = xml.HTMLEntity
    for {
      _, err := decoder.Token()
      switch err {
      case io.EOF:
        return // We're done, it's valid!
      case nil:
        // do nothing
      default:
        t.Fatalf("Error parsing html: %s", err)
      }
    }
  }

source

This code configures the parser to have the right level of leniency
for HTML, and then parses the HTML token by token. Indeed, we see the error
message we wanted:

--- FAIL: Test_wellFormedHtml (0.00s)
    index_template_test.go:61: Error parsing html: XML syntax error on line 4: unexpected end element 

In Java, a useful library to use is jsoup:

Java

  @Test
  void indexIsSoundHtml() {
      var template = Mustache.compiler().compile(
              new InputStreamReader(
                      getClass().getResourceAsStream("/index.tmpl")));
      var model = new TodoList();
  
      var html = template.execute(model);
  
      var parser = Parser.htmlParser().setTrackErrors(10);
      Jsoup.parse(html, "", parser);
      assertThat(parser.getErrors()).isEmpty();
  }

source

And we see it fail:

java.lang.AssertionError: 
Expecting empty but was:<[<1:13>: Unexpected EndTag token [] when in state [InBody],

Success! Now if we copy over the contents of the TodoMVC
template
to our index.tmpl file, the test passes.

The test, however, is too verbose: we extract two helper functions, in
order to make the intention of the test clearer, and we get

Go

  func Test_wellFormedHtml(t *testing.T) {
    model := todo.NewList()
  
    buf := renderTemplate("index.tmpl", model)
  
    assertWellFormedHtml(t, buf)
  }

source

Java

  @Check
  void indexIsSoundHtml() {
      var model = new TodoList();
  
      var html = renderTemplate("/index.tmpl", model);
  
      assertSoundHtml(html);
  }

source

Stage 2: testing HTML structure

What else should we test?

We know that the look of a page can only be tested, ultimately, by a
human looking at how it is rendered in a browser. However, there is often
logic in templates, and we want to be able to test that logic.

One might be tempted to test the rendered HTML with string equality,
but this approach fails in practice, because templates contain a lot of
details that make string equality assertions impractical. The assertions
become very verbose, and when reading an assertion, it is difficult
to understand what it is we are trying to prove.

What we need
is a way to assert that some parts of the rendered HTML
correspond to what we expect, and to ignore all the details we don't
care about.
One way to do this is by running queries with the CSS selector language:
it is a powerful language that allows us to select the
elements that we care about from the whole HTML document. Once we have
selected those elements, we (1) count that the number of elements returned
is what we expect, and (2) check that they contain the text or other content
that we expect.

The UI that we are supposed to generate looks like this:

Test-Driving HTML Templates

There are several details that are rendered dynamically:

  1. The number of items and their text content change, obviously
  2. The style of the todo-item changes when it is completed (e.g., the second one)
  3. The “2 items left” text will change with the number of non-completed items
  4. One of the three buttons “All”, “Active”, “Completed” will be highlighted, depending on the current url; for instance, if we decide that the url that shows only the “Active” items is /active, then when the current url is /active, the “Active” button should be surrounded by a thin red rectangle
  5. The “Clear completed” button should only be visible if any item is completed

Each of these concerns can be tested with the help of CSS selectors.

This is a snippet from the TodoMVC template (slightly simplified). I
haven't yet added the dynamic bits, so what we see here is static
content, provided as an example:

index.tmpl

  

source

This rare earth metal shows us the future of our planet’s resources


Demand for neodymium-based magnets could outstrip supply in the coming decade. The longer-term prospects for the metal’s supply aren’t as dire, but a careful look at neodymium’s potential future reveals many of the challenges we’ll likely face across the supply chain for materials in the coming century and beyond.

Peak panic

Before we get into our material future, it’s important to point out just how hard it has always been to make accurate predictions of this kind. Just look at our continual theorizing about the supply of fossil fuels.

One version of the story, told frequently in economics classes, goes something like this: Given that there is a limited supply of oil, at some point the world will run out of it. Before then, we should reach some maximum amount of oil extraction, after which production will start an irreversible decline. That high point is known as “peak oil.”

This idea has been traced back as far as the early 1900s, but one of the most well-known analyses came from M. King Hubbert, who was a geologist at Shell. In a 1956 paper, Hubbert considered the total amount of oil (and other fossil fuels, like coal and natural gas) that geologists had identified on the planet. From the estimated supply and the amount the world had burned through, he predicted that oil production in the US would peak and begin declining between 1965 and 1970. The peak of world oil production, he predicted, would come a bit later, in 2000.

For a while, it looked as if Hubbert was right. US oil production increased until 1970, when it reached a dramatic peak. It then declined for decades afterward, until about 2010. But then advances in drilling and fracking techniques unlocked hard-to-reach reserves. Oil production skyrocketed in the US through the 2010s, and as of 2023, the country was producing more oil than ever before.

Peak-oil panic has long outlived Hubbert, but every time economists and geologists have predicted that we’ve reached, or are about to reach, the peak of oil production, they’ve missed the mark (so far).