Back in 2019 I told you about AWS Data Exchange and showed you how to Find, Subscribe To, and Use Data Products. Today, you can choose from over 3600 data products in ten categories.
In my introductory post I showed you how you could subscribe to data products and then download the data sets into an Amazon Simple Storage Service (Amazon S3) bucket. I then suggested various options for further processing, including AWS Lambda functions, an AWS Glue crawler, or an Amazon Athena query.
Today we’re making it even easier for you to find, subscribe to, and use third-party data with the introduction of AWS Data Exchange for Amazon Redshift. As a subscriber, you can directly use data from providers without any further processing, and with no need for an Extract Transform Load (ETL) process. Because you don’t have to do any processing, the data is always current and can be used directly in your Amazon Redshift queries. AWS Data Exchange for Amazon Redshift takes care of managing all entitlements and payments for you, with all charges billed to your AWS account.
As a provider, you now have a new way to license your data and make it available to your customers.
As I was writing this post, it was cool to realize just how many existing aspects of Redshift and Data Exchange played central roles. Because Redshift has a clean separation of storage and compute, along with built-in data sharing features, the data provider allocates and pays for storage, and the data subscriber does the same for compute. The provider doesn’t need to scale their cluster in proportion to the size of their subscriber base, and can focus on acquiring and providing data.
Let’s take a look at this feature from two vantage points: subscribing to a data product, and publishing a data product.
AWS Data Exchange for Amazon Redshift – Subscribing to a Data Product
As a data subscriber I can browse through the AWS Data Exchange catalog, find data products that are relevant to my business, and subscribe to them.
Data providers can also create private offers and extend them to me for access via the AWS Data Exchange Console. I click My product offers, and review the offers that have been extended to me. I click Continue to subscribe to proceed.
Then I complete my subscription by reviewing the offer and the subscription terms, noting the data sets that I will get, and clicking Subscribe.
Once the subscription is completed, I’m notified and can move forward.
From the Redshift Console, I click Datashares, select Subscriptions, and I can see the subscribed data set.
Next, I associate it with one or more of my Redshift clusters by creating a database that points to the subscribed datashare, and use the tables, views, and stored procedures to power my Redshift queries and my applications.
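The association step can also be expressed in SQL rather than through the console. Here is a minimal sketch that builds the `CREATE DATABASE ... FROM DATASHARE` statement a subscriber would run on their cluster. The helper name, the database name, the account ID, and the namespace GUID are all hypothetical placeholders; the real values come from the subscription details.

```python
def create_database_from_datashare_sql(database, datashare, account_id, namespace):
    """Build the statement that maps a subscribed datashare to a local database.

    All arguments are illustrative; substitute the provider account and
    namespace shown for your own subscription.
    """
    # Only allow simple identifiers so the generated SQL stays well formed.
    for identifier in (database, datashare):
        if not identifier.replace("_", "").isalnum():
            raise ValueError(f"unexpected identifier: {identifier!r}")
    return (
        f"CREATE DATABASE {database} "
        f"FROM DATASHARE {datashare} "
        f"OF ACCOUNT '{account_id}' NAMESPACE '{namespace}';"
    )


# Hypothetical placeholder values, not taken from a real subscription.
statement = create_database_from_datashare_sql(
    database="area_codes_db",
    datashare="area_code_reference",
    account_id="123456789012",
    namespace="00000000-0000-0000-0000-000000000000",
)
print(statement)
```

Once the database exists, the shared objects can be queried with three-part names such as `area_codes_db.public.area_codes`.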
AWS Data Exchange for Amazon Redshift – Publishing a Data Product
As a data provider I can include Redshift tables, views, schemas and user-defined functions in my AWS Data Exchange product. To keep things simple, I’ll create a product that includes just one Redshift table.
I use the spiffy new Redshift Query Editor V2 to create a table that maps US area codes to a city and a state.
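To make the shape of that table concrete, here is a small local sketch that uses Python's built-in `sqlite3` module as a stand-in for Redshift. The column names and types are assumptions (the original table definition is not shown in the text), and the three sample rows are only illustrative.

```python
import sqlite3

# In-memory SQLite database used purely as a local stand-in for a Redshift cluster.
conn = sqlite3.connect(":memory:")

# Assumed schema: one row per area code, with its city and state.
conn.execute(
    """
    CREATE TABLE area_codes (
        area_code INTEGER PRIMARY KEY,
        city      TEXT NOT NULL,
        state     TEXT NOT NULL
    )
    """
)

sample_rows = [
    (206, "Seattle", "WA"),
    (212, "New York", "NY"),
    (415, "San Francisco", "CA"),
]
conn.executemany("INSERT INTO area_codes VALUES (?, ?, ?)", sample_rows)


def lookup(connection, area_code):
    """Return (city, state) for an area code, or None if it is not in the table."""
    cursor = connection.execute(
        "SELECT city, state FROM area_codes WHERE area_code = ?", (area_code,)
    )
    return cursor.fetchone()


print(lookup(conn, 206))
```

In Redshift the same `CREATE TABLE` and `INSERT` statements would be run in the query editor, with column types adjusted to Redshift's own (for example `VARCHAR` instead of `TEXT`).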
Then I examine the list of existing datashares for my Redshift cluster, and click Create datashare to make a new one.
Next, I go through the usual process for creating a datashare. I select AWS Data Exchange datashare, assign a name (area_code_reference), choose the database within the cluster, and make the datashare accessible to publicly accessible clusters.
Then I scroll down and click Add to move forward.
I choose my schema (public), opt to include only tables and views in my datashare, and then add the area_codes table.
At this point I can click Add to wrap up, or Add and repeat to make a more complex product that contains additional objects.
I confirm that the datashare contains the table, and click Create datashare to move forward.
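The console steps above have a rough SQL equivalent. The sketch below generates the statements I would expect to run as the provider, using the datashare, schema, and table names from this walkthrough. The helper function is hypothetical, and the exact clauses (in particular `MANAGEDBY ADX` and `SET PUBLICACCESSIBLE`) should be checked against the current Redshift SQL reference before use.

```python
def datashare_statements(datashare, schema, tables, publicly_accessible=True):
    """Return the SQL statements that create and populate an ADX-managed datashare."""
    # MANAGEDBY ADX marks the datashare as one managed through AWS Data Exchange.
    statements = [f"CREATE DATASHARE {datashare} MANAGEDBY ADX;"]
    if publicly_accessible:
        # Allow consumption from publicly accessible clusters.
        statements.append(f"ALTER DATASHARE {datashare} SET PUBLICACCESSIBLE TRUE;")
    # The schema has to be added before any of its tables.
    statements.append(f"ALTER DATASHARE {datashare} ADD SCHEMA {schema};")
    for table in tables:
        statements.append(f"ALTER DATASHARE {datashare} ADD TABLE {schema}.{table};")
    return statements


statements = datashare_statements("area_code_reference", "public", ["area_codes"])
for sql in statements:
    print(sql)
```

Passing more table names to the helper corresponds to using Add and repeat in the console.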
Now I’m ready to start publishing my data! I go to the AWS Data Exchange Console, expand the navigation on the left, and click Owned data sets.
I review the Data set creation steps, and click Create data set to proceed.
I select Amazon Redshift datashare, give my data set a name (United States Area Codes), enter a description, and click Create data set to proceed.
I create a revision called v1.
I select my datashare and click Add datashare(s).
Then I finalize the revision.
I showed you how to create a datashare and a data set, and to publish a product using the console. If you are publishing multiple products and/or making regular revisions, you can automate all of these steps using the AWS Command Line Interface (CLI) and the AWS Data Exchange APIs.
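As a rough illustration of that automation, here is a sketch of the data set, revision, and import steps written against the boto3 `dataexchange` client. The function name, its parameters, and the polling interval are my own choices, and the datashare ARN is a placeholder; treat the request shapes as assumptions to verify against the AWS Data Exchange API reference. The client is passed in as an argument so the flow can be exercised without AWS credentials.

```python
import time


def publish_datashare_revision(client, datashare_arn, name, description,
                               comment="v1", poll_seconds=5):
    """Create a data set, add a revision, import a datashare, and finalize it.

    `client` is expected to behave like boto3.client("dataexchange").
    """
    data_set = client.create_data_set(
        AssetType="REDSHIFT_DATA_SHARE",
        Name=name,
        Description=description,
    )
    revision = client.create_revision(DataSetId=data_set["Id"], Comment=comment)

    # Importing the datashare into the revision runs as an asynchronous job.
    job = client.create_job(
        Type="IMPORT_ASSETS_FROM_REDSHIFT_DATA_SHARES",
        Details={
            "ImportAssetsFromRedshiftDataShares": {
                "AssetSources": [{"DataShareArn": datashare_arn}],
                "DataSetId": data_set["Id"],
                "RevisionId": revision["Id"],
            }
        },
    )
    client.start_job(JobId=job["Id"])

    # Poll until the import job reaches a terminal state.
    while True:
        state = client.get_job(JobId=job["Id"])["State"]
        if state == "COMPLETED":
            break
        if state in ("ERROR", "CANCELLED", "TIMED_OUT"):
            raise RuntimeError(f"import job ended in state {state}")
        time.sleep(poll_seconds)

    # A revision must be finalized before it can be published in a product.
    client.update_revision(
        DataSetId=data_set["Id"], RevisionId=revision["Id"], Finalized=True
    )
    return data_set["Id"], revision["Id"]


# Typical use (requires boto3 and AWS credentials):
#   import boto3
#   publish_datashare_revision(
#       boto3.client("dataexchange"),
#       datashare_arn="<ARN of the area_code_reference datashare>",
#       name="United States Area Codes",
#       description="Maps US area codes to a city and a state",
#   )
```

Creating and publishing the product listing itself is a separate step that this sketch does not cover.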
Initial Data Products
Multiple data providers are working to make their data products available to you through AWS Data Exchange for Amazon Redshift. Here are some of the initial offerings and the official descriptions:
- FactSet Supply Chain Relationships – FactSet Revere Supply Chain Relationships data is built to expose business relationship interconnections among companies globally. This feed provides access to the complex networks of companies’ key customers, suppliers, competitors, and strategic partners, collected from annual filings, investor presentations, and press releases.
- Foursquare Places 2021: New York City Sample – This trial dataset contains Foursquare’s integrated Places (POI) database for New York City, accessible as a Redshift Data Share. Instantly load Foursquare’s Places data in to a Redshift table for further processing and analysis. Foursquare data is privacy-compliant, uniquely sourced, and trusted by top enterprises like Uber, Samsung, and Apple.
- Mathematica Medicare Pilot Dataset – Aggregate Medicare HCC counts and prevalence by state, county, payer, and filtered to the diabetic population from 2017 to 2019.
- COVID-19 Vaccination in Canada – This listing contains sample datasets for COVID-19 Vaccination in Canada data.
- Revelio Labs Workforce Composition and Trends Data (Trial data) – Understand the workforce composition and trends of any company.
- Facteus – US Card Consumer Payment – CPG Backtest – Historical sample from panel of SKU-level transaction detail from cash and card transactions across hundreds of Consumer-Packaged Goods sold at over 9,000 urban convenience stores and bodegas across the U.S.
- Decadata Argo Supply Chain Trial Data – Supply chain data for CPG companies delivering products to US Grocery Retailers.