The first "Start to Finish" guide that teaches you everything you need to know about building a project with Hygraph. Content, queries and so much more!
# Create a Project: Data Structures, Data Migration and Presentation
Content projects are complex. Getting the technical pieces in place is only half of the game. The other half is defining a content model that will successfully represent your idea without limiting you in the future. While Hygraph makes many of the steps involved rather straightforward, we’re going to do a complete project together from start to finish!
What we will do
At each step we will combine some manual (from the user interface) changes as well as programmatic changes, using the API directly.
- We will define our content model
- Migrate our data
- Display the content in an interactive way, looking at the various features of the GraphQL language.
What you should know
While I will attempt to write in a way that non-technical people can follow along, if you want to adapt any of the code to your own use cases you will need to know the basics of HTML, ECMAScript, Node and GraphQL. But don’t worry! We’ll explain the parts as we go along!
What we will be creating
As someone with a bit of a foodie flair, I’m a big fan of the folks at Wine Folly. Not just because they kill it in the content department, but because they are from my home state! So, without asking (they say it’s OK for personal use) we will recreate this poster, digitally.
Creating the Project
The first step is to create our project.
Let's look at our data! We have two types to choose from when defining our schema. A model represents complex data structures and an enumeration represents a defined list of singular values (think something akin to a select box).
Let's unpack this.
Identifying the types
1 "Vegetables"
This seems to be a defined list. It is one of Meat, Preparation, Dairy, etc.
- Food Category => Enumeration
2 "Alliums"
At first this would also seem to be a member of a defined list with a single property. However, this is a bit misleading. Take a look at 1 and 5. Our value assignments (pairing, perfect pairing) as well as vegetable belong to this entry! It's a model!
This is helpful for us for additional reasons. Imagine if we wanted to do a little more with our content later? Right now we only borrow the category name, but what if we wanted to include a flag for known allergens such as on Nightshades? Now we have a fourth value to include.
Why not make Vegetables a model as well? At some point you have to draw a line. Technically we could add additional data to Vegetables as well. However, a helpful mental model is to reduce the number of fields the higher up the classification goes. Vegetable is the top classification in our case, so we'll limit its property set. Keeping this as an enum also has benefits later, which we'll see.
- Food Class => Model
3 "onion", "shallot", etc
This is a member of Alliums, and there can be many members, so it's a Model. Models can only have a single entry from a selected Enumeration.
- Food => Model
4 & 8 "Bold Red"
I labeled this twice, but they are technically the same. Bold Red is the top classification in our wine analogy, and so we would normally give it an Enumeration. However, number 2 owns several instances of these classifications through the edges of pairing and perfect pairing. With an Enumeration you can have exactly one selection. Further, in my case, I want to track the pairings based on type, not by adding all wines of a type directly to the food pairing. That gives us more flexibility in the future. For example, adding a new white wine would only require choosing the correct Wine Class and it will be instantly connected throughout our graph.
- Wine Class => Model
5, 6 & 7
This becomes a bit more complex. 5 is actually a reference to 4, which is our Wine Class, such as Bold Red. The rating actually exists as a member of Food Class. Written out, Nuts & Seeds has a collection of Wine Classes that are food pairings. It also has a collection of Wine Classes that are perfect pairings.
Side note, a perfect pairing is a subjective (and arguably universal in many cases) judgement about how well the drink pairs.
In this case, I am going to add 6 and 7 as two fields onto our 2 model as a relationship made up of 5, our Wine Class. Here, 5 is a presentational artifact, something that appears as a result of the relationship between 2, 4, 6 and 7. These will be "field definitions" on our Food Class model.
Tongue-tied yet? Keep going, it will become clearer when we actually create the models!
8 (See 4)
9 Malbec
Since this belongs to a category and has at least one additional field (the label), it's a Model.
- Wine => Model
Identifying the fields
From our above analysis, we are left with 1 Enumeration for the top level Food Category. Since it is a simple list of values, there are no fields to identify.
On our Models, however, the fields are everything. After we broke everything down, we were left with 4 Models: one for the Food entry, one for the Food Class, one for the Wine entry and one for the Wine Class.
Each of these needs its own set of fields to help reconstruct our visual later and add the magical powers to our graph.
Wine Model
- Name => String
- Category => Enumeration
Wine Classification
- Name => String
- Wine => Relation to Wine Model
Food Classification
- Name => String
- Food Category => Enumeration
- Pairing => Relation to Wine Classification Model
- WF Perfect Pairing => Relation to Wine Classification Model
Food
- Name => String
- Food Classification => Relation to Food Classification Model
And with that, our content analysis is complete! We can move on to defining the models and enumerations in our schema!
A Note about Display Names: In the future, we'll offer a field type that is simply a dropdown of strings which will allow better handling of display names. In my project I could also use all Models to be able to control my display names since Enumerations have very strict rules about their composition. For my purposes, I will format those to desired values on the client later on.
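The client-side formatting mentioned in the note can be as small as a one-line helper. This is a minimal sketch, assuming underscores in an enumeration value stand in for spaces; the name `toDisplayName` is made up for this example and is not part of any Hygraph API.

```javascript
// Hypothetical helper: turn an enum-style value such as "Bold_Red"
// into a readable label by swapping underscores for spaces.
function toDisplayName(enumValue) {
  return enumValue.replace(/_/g, ' ').trim()
}

console.log(toDisplayName('Bold_Red'))     // Bold Red
console.log(toDisplayName('Herbs_Spices')) // Herbs Spices
```

If a label needs punctuation that the value cannot carry (an ampersand, for instance), a small lookup object keyed by the enum value is probably the simpler route.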
Defining the schema
We are going to edit our schema in two ways: manually and with the API.
Here's how we create an enum with the UI.
Note: In an earlier iteration of this post, I had the wine category as an Enum. This video is not actually part of the tutorial any more, but it is helpful to show how you could create an enum manually.
Totally easy! But if we have several models to define or lots of options to add to our enums, it's by far easier to do this in code!
Enter, the API Playground
We need to get the stageId of our project, so we'll use this query to accomplish that.
Note that we are on the Management API. The Management API allows us to change all aspects of the project, whereas the Project API lets us change all aspects of the content.
Double Note: The Management API is still being developed, so it's possible there will be some breaking changes. We will update this post accordingly, but if things aren't working, check out the documentation.
{
  viewer {
    projects {
      name
      stages {
        name
        id
      }
    }
  }
}
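If you run that query from a script rather than the playground, you still need to pick the right id out of the nested response. Here is a rough sketch of how that could look. The helper name `findStageId`, the project name and the stage name are all placeholders I made up; only the response shape comes from the query above.

```javascript
// Hypothetical helper: dig a stage id out of the response to the
// viewer query. Returns null when the project or stage is missing.
function findStageId(response, projectName, stageName) {
  const project = response.data.viewer.projects.find(p => p.name === projectName)
  if (!project) return null
  const stage = project.stages.find(s => s.name === stageName)
  return stage ? stage.id : null
}

// Made-up response in the shape the query asks for.
const sampleResponse = {
  data: {
    viewer: {
      projects: [
        {
          name: 'Wine Pairing',
          stages: [{ name: 'master', id: 'STAGE_ID_PLACEHOLDER' }]
        }
      ]
    }
  }
}

console.log(findStageId(sampleResponse, 'Wine Pairing', 'master')) // STAGE_ID_PLACEHOLDER
```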
Let's create our last enum with the following query and variable input:
Query
graphql
mutation createEnum($kinds: [String!]!) {
createEnumeration(data: {
stageId: "YOUR_STAGE_ID_FROM_THE_FIRST_QUERY",
apiId: "FoodCategory",
displayName: "Food Categories"
values: $kinds
}) {
values
}
}
Variables
json
{
"kinds": [
"Herbs_Spices",
"Dairy",
"Meat",
"Preparation",
"Starch",
"Sweet",
"Vegetables"
]
}
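When the source list arrives as display labels rather than enum-safe values, a few lines of JavaScript can produce the `kinds` variable for you. This is only a sketch: it assumes the GraphQL naming rule (letters, digits and underscores, not starting with a digit), and Hygraph's own rules may be stricter. The helper name `toEnumValue` and the input label `Herbs & Spices` are my own assumptions.

```javascript
// Hypothetical helper: squeeze a display label into a GraphQL-style name.
function toEnumValue(label) {
  const cleaned = label
    .replace(/[^A-Za-z0-9]+/g, '_') // runs of other characters become one underscore
    .replace(/^_+|_+$/g, '')        // no leading or trailing underscores
  // A GraphQL name cannot start with a digit, so prefix those.
  return /^[0-9]/.test(cleaned) ? `_${cleaned}` : cleaned
}

const labels = ['Herbs & Spices', 'Dairy', 'Meat']
const variables = { kinds: labels.map(toEnumValue) }

console.log(JSON.stringify(variables)) // {"kinds":["Herbs_Spices","Dairy","Meat"]}
```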
You can imagine the time savings this creates if you are provided with a large list of enum values that you otherwise have to copy and paste! Now it's time to move on to creating some models. Of course, this is quite easy in the GUI.
And, of course, it's also quite easy from the API. First we need to create a model and get its ID in return.
mutation {createModel(data: {stageId:"YOUR_STAGE_ID",apiId: "Wine",displayName: "Wine"}) {model {id}}}
Using the id from the previous query we can create the fields.
Mutation
graphql
mutation addFields($modelID: ID!) {
createStringField(data: {
modelId: $modelID,
apiId: "name",
displayName: "Name",
isRequired: true,
isUnique: true,
isList: false,
formConfig: {
renderer: "GCMS_SINGLE_LINE"
},
tableConfig: {
renderer: "GCMS_SINGLE_LINE"
}
}) {
field {
displayName
}
}
# This part is no longer relevant, but is interesting to see for the sake of reference!
createEnumerationField(data: {
modelId: $modelID,
enumerationId: "YOUR_ENUM_ID",
apiId: "category",
displayName: "Category",
isRequired: true,
isUnique: false,
isList: false,
formConfig: {
renderer: "GCMS"
},
tableConfig: {
renderer: "GCMS"
}
}) {
field {
displayName
}
}
}
Variables
json
{
"modelID": "YOUR_MODEL_ID_FROM_PREVIOUS_MUTATION"
}
One of the selling points for Hygraph is that we abstract away some of the pain of writing these queries when you don't want to. In the case of adding fields, it's not actually easier to use the API. We'll only create one model with fields via the API.
This model, and most of the other models, will incorporate a Reference. Before we explain that, see if you can get a new model to look like this!
References are powerful in a graph database, they're what drive the magic connections that allow for highly detailed queries. You can read more about them in our documentation here.
In our case, we are going to create a one to many relationship between our wine model and our wine classification model. The reason is that our wine classification can have many wines, but for our current content needs, a wine can have only one wine classification. Drag the reference field onto the wine model and arrange the settings like this:
Moving on we can create our food model, which will also need a relationship. Start by creating a model that looks like this:
And then add a relationship with these settings. Similar to wine, a food can have one Food Class but a Food Class can have many foods.
Now that we've gotten comfortable with our "Relationship" abilities, we're going to revisit our Food Class model and customize some relationship properties on there.
Our pairings and our perfect pairings are both relationships to the Wine Class model. In one example, a sub-selection of wine classes are pairs, and in the other a sub-selection of wine classes are perfect pairs. In this case, we also don't want the default API ID, we want to customize it to something we can reason about when looking at our data response from the server.
Add two more reference fields to the Wine Class model, and match these configurations.
Pairings
Perfect Pairs
Here we have two relationships that point to the same underlying model but are customized in naming to help us identify their purpose later on.
Adding a second unique field to a model
Warning: Deep API usage coming up
By the time you are reading this, it's quite likely that adding extra unique fields to a model will be supported from the user interface. Until then, we will do this via the API Explorer. We will add a second unique field, a "slug", alongside our display name. For example, if we wanted to create a navigable route in our app to Rosé, a type of wine, the slug would need to be stripped of any special characters, leaving just Rose.
Here's the mutation.
mutation {
  createStringField(data: {
    modelId: "YOUR_WINECLASS_MODEL_ID"
    isUnique: true
    isRequired: false
    isList: false
    apiId: "slug"
    displayName: "Slug"
    formConfig: {
      renderer: "GCMS_SINGLE_LINE"
    },
    tableConfig: {
      renderer: "GCMS_SINGLE_LINE"
    }
  }) {
    field {
      displayName
    }
  }
}
And we'll add a slug to our FoodClass model for the same reason.
mutation {
  createStringField(data: {
    modelId: "YOUR_FOODCLASS_MODEL_ID"
    isUnique: true
    isRequired: false
    isList: false
    apiId: "slug"
    displayName: "Slug"
    formConfig: {
      renderer: "GCMS_SINGLE_LINE"
    },
    tableConfig: {
      renderer: "GCMS_SINGLE_LINE"
    }
  }) {
    field {
      displayName
    }
  }
}
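To make the Rosé example concrete, here is one way a slug could be derived from a display name in JavaScript. It's a sketch, not what the import scripts in this post do (they reuse the raw CSV value as the slug), and `toSlug` is a name I made up.

```javascript
// Hypothetical helper: derive a slug from a display name.
function toSlug(name) {
  return name
    .normalize('NFD')                // split "é" into "e" plus a combining accent
    .replace(/[\u0300-\u036f]/g, '') // drop the combining accents
    .replace(/[^A-Za-z0-9]+/g, '_')  // anything else becomes an underscore
    .replace(/^_+|_+$/g, '')         // trim stray underscores at the ends
}

console.log(toSlug('Rosé'))     // Rose
console.log(toSlug('Bold Red')) // Bold_Red
```

Using underscores keeps the result in line with values like `Bold_Red` that show up in the data sets later on.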
And with that, our content models are completed! Now we can move on to migrating data!
Importing the data
The next step will be writing our import script. Here's where some knowledge of Node.js will be helpful and/or required.
Creating our script project
Let's create a new script project. Create a new directory on your computer, navigate inside from the terminal and run the following series of commands:
# Init a project
$ npm init -f

# Add dependencies
$ yarn add csvtojson isomorphic-fetch

# Create our script files
$ touch importWine.js importWineClass.js importFood.js importFoodClass.js

# Make a directory called data
$ mkdir data

# Download our data sets
# Wine
$ curl https://gist.githubusercontent.com/motleydev/689ac7b59fdecf5f70579e700ffb9524/raw/65db79a2b50a42716e2123334337f0058cc1372c/wine.csv > ./data/wine.csv

# Food
$ curl https://gist.githubusercontent.com/motleydev/0d32837213431fd7889d1f029ca58897/raw/9799c0e68f46fb05280d79fbb004777e89d5ca9e/food.csv > ./data/food.csv

# Food Classes
$ curl https://gist.githubusercontent.com/motleydev/709d3f8e1ea0aef6af67f613dfb8b635/raw/d3127a05520c72a89d498491ff078042b0936d8c/foodClasses.csv > ./data/foodClasses.csv
Creating a token for our API
By default, our API is closed to the public. We'll need to authenticate ourselves somehow. To do that, we can create a Permanent Auth Token. (Yes, they can be deleted later!)
Here's a quick animation on how to do that.
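The scripts in this post paste the token straight into the source to keep things short. If you'd rather keep it out of your files, something like the following could work. It is a sketch under my own assumptions: the environment variable names `HYGRAPH_ENDPOINT` and `HYGRAPH_TOKEN` and the helper `buildRequestConfig` are invented for this example.

```javascript
// Hypothetical helper: build the endpoint and headers for our requests
// from environment variables instead of hard-coded strings.
function buildRequestConfig(env) {
  const { HYGRAPH_ENDPOINT, HYGRAPH_TOKEN } = env
  if (!HYGRAPH_ENDPOINT || !HYGRAPH_TOKEN) {
    throw new Error('Set HYGRAPH_ENDPOINT and HYGRAPH_TOKEN before running the script')
  }
  return {
    endpoint: HYGRAPH_ENDPOINT,
    headers: {
      'Content-Type': 'application/json',
      'Authorization': `Bearer ${HYGRAPH_TOKEN}`
    }
  }
}

// Usage inside an import script:
// const { endpoint, headers } = buildRequestConfig(process.env)
```

You would then start a script with the two variables set in your shell, and the token never lands in version control.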
Coding
Importing the Wine Classes
Now let's open up importWineClass.js in a code editor. It's always a good idea to create the dependent pieces of data before bringing in your primary content model. In our case, both our data models are rather simple, and so we're going to jump right into the deep end and show you how to import BOTH at once, and connect them! Hold on!
Here's the body of our script.
const csv = require('csvtojson')
const fetch = require('isomorphic-fetch')

// Our endpoint, which we can get from the project dashboard
const endpoint = "YOUR_API_ENDPOINT"

// Typically you'd never want to put a token here
// in plain text, but for our little script, it's ok.
const token = "YOUR_TOKEN"

// Our mutation to write data to our database
const mutation = `mutation CreateWineClass(
  $kind: String,
  $wines: [WineCreateWithoutWineClassInput!]
) {
  createWineClass(data: {
    name: $kind
    slug: $kind
    wines: {
      create: $wines
    }
  }) {
    name
    id
  }
}`;

// Our script to import the data.
// The file name matches the one we downloaded with curl.
csv()
  .fromFile('./data/wine.csv')
  .then(wines => {
    // Sets allow us to force unique entries,
    // so we have just a set of our wine classes
    // from our larger data set.
    const wineClasses = new Set()

    // Pushing our wine classes into our set.
    for (const wine of wines) {
      wineClasses.add(wine.kind)
    }

    // The [...wineClasses] allows us to
    // convert the set to an array, so we
    // can use map, so we can make
    // asynchronous calls, so, yah…
    const promises = [...wineClasses].map(async wineClass => {
      try {
        const formattedWine = {
          kind: wineClass, // The Class
          wines: wines
            // Filter our wines by the current class
            .filter(wine => wine.kind === wineClass)
            // Format our data in a way our API likes,
            // see the video to explain how I figured that part out.
            .map(wine => ({ name: wine.value, status: 'PUBLISHED' }))
        }

        // The Fetch statement to send the data for each
        const resp = await fetch(endpoint, {
          headers: {
            'Content-Type': 'application/json',
            'Authorization': `Bearer ${token}`
          },
          method: 'POST',
          body: JSON.stringify({
            query: mutation,
            variables: formattedWine
          })
        })

        // Parse the response to verify success
        const body = await resp.json()
        const data = body.data
        console.log('Uploaded', data)
        return
      } catch (error) {
        console.log("Error!", error)
      }
    })

    Promise.all(promises).then(() => console.log("Done"))
  })
The astute will notice that I've forgotten to add the PUBLISHED status to my wine class. Well, my mistake is your win, here's an example query in the API Explorer that demonstrates how easy bulk updates are.
mutation {updateManyWineClasses(where: {status: DRAFT}, data: {status: PUBLISHED}) {count}}
The power of the API Explorer cannot be overstated!
I want to explain the mutation statement from our create function above. There's a lot going on here, so I've recorded a short video to explain the parts.
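In short, the nested `create` inside `createWineClass` lets a single request write a wine class together with all of its wines. To see what actually gets sent, here is a small sketch that builds the variables for one class. The helper name `buildWineClassVariables` is mine, and the sample rows are made up (apart from Malbec); they only mimic the `kind` and `value` columns the script reads.

```javascript
// Made-up rows in the shape csvtojson hands us:
// one object per CSV line, keyed by the column headers.
const sampleWines = [
  { kind: 'Bold_Red', value: 'Malbec' },
  { kind: 'Bold_Red', value: 'Example Red' },
  { kind: 'Light_White', value: 'Example White' }
]

// Hypothetical helper: the variables object for one wine class.
function buildWineClassVariables(wines, kind) {
  return {
    kind,
    wines: wines
      .filter(wine => wine.kind === kind)
      .map(wine => ({ name: wine.value, status: 'PUBLISHED' }))
  }
}

console.log(JSON.stringify(buildWineClassVariables(sampleWines, 'Bold_Red'), null, 2))
```

The `$kind` variable feeds both `name` and `slug`, and the `wines` array lands in the nested `create`, so each wine is created and connected to its class in the same call.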
Go back to your content window and see if we have our data. Let's do a test query in the API Explorer to make sure everything looks like we'd expect!
{
  wines {
    name
    wineClass {
      name
    }
  }
}
And here's another query that will start to reveal some of the power behind the graph connections.
{
  wines(where: {wineClass: {name_in: ["Bold_Red", "Medium_Red"]}}, orderBy: name_ASC) {
    name
  }
}
Good job! Now we will repeat the process to import our food classifications. Here's what our foodClasses.csv data set looks like.
classification, kind, pairing, wfpp
Meat, RED MEAT, Bold_Red;Medium_Red, Bold_Red
Meat, CURED MEAT, Bold_Red;Medium_Red;Light_Red;Rose;Sparkling;Sweet_White;Dessert, Light_Red;Sweet_White
Meat, PORK, Bold_Red;Medium_Red;Rose;Sparkling, Medium_Red
Meat, POULTRY, Medium_Red;Light_Red;Rose;Rich_White;Light_White;Sparkling, Light_Red;Rich_White
Meat, MOLLUSK, Light_White;Sparkling, Sparkling
...
This would be quite typical for how we might get data from a third-party data source. I had the benefit of being able to hand craft the data, but tried to replicate standard csv export behavior.
Note: The author has done some additional work to ensure the integrity of the data, hopefully you won't have any issues, but since this was a hand generated data-set, some fat-finger issues were bound to happen!
Back to the scripting! Let's open up importFoodClass.js and paste in this script.
const csv = require('csvtojson')
const fetch = require('isomorphic-fetch')

// Our endpoint, which we can get from the project dashboard
const endpoint = "YOUR_API_ENDPOINT"

// Typically you'd never want to put a token here
// in plain text, but for our little script, it's ok.
const token = "YOUR_TOKEN"

// Our mutation to write data to our database
const mutation = `mutation CreateFoodClass(
  $kind: String!,
  $classification: FoodCategory!,
  $pairing: [WineClassWhereUniqueInput!],
  $wfpp: [WineClassWhereUniqueInput!]
) {
  createFoodClass(data: {
    name: $kind,
    slug: $kind,
    status: PUBLISHED,
    foodCategory: $classification,
    wineClassWFPPs: {
      connect: $wfpp
    },
    wineClassPairings: {
      connect: $pairing
    }
  }) {
    name
  }
}`;

// Our script to import the data
csv()
  .fromFile('./data/foodClasses.csv')
  .then(foodClasses => {
    // Format our data and provide null value for missing data
    // in the API, a [Type!] can be null, but can't be an array
    // with null values. This would cause an import error.
    const createFormattedArray = arr =>
      arr.length >= 1
        ? arr.split(';').map(item => ({ slug: item }))
        : null

    const promises = foodClasses.map(async foodClass => {
      // Parse our 'item;item;item' string into an array
      // with the proper shape for the API, which we find in
      // the api explorer.
      foodClass.pairing = createFormattedArray(foodClass.pairing)
      foodClass.wfpp = createFormattedArray(foodClass.wfpp)

      try {
        // The Fetch statement to send the data for each
        const resp = await fetch(endpoint, {
          headers: {
            'Content-Type': 'application/json',
            'Authorization': `Bearer ${token}`
          },
          method: 'POST',
          body: JSON.stringify({ query: mutation, variables: foodClass })
        })

        // Parse the response to verify success
        const body = await resp.json()

        // I introduced an error catcher which wasn't in the previous
        // import scripts. It's a good idea.
        // As a spec rule, error will always be 'errors' which
        // is a one or more array of errors and some diagnostic data.
        // I just wanted the error message.
        if (body.errors) {
          console.log("Error", body.errors.map(error => error.message))
        }

        const data = body.data
        console.log('Uploaded', data)
        return
      } catch (error) {
        console.log("Error!", error)
      }
    })

    Promise.all(promises).then(() => console.log("Done"))
  })
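The one piece of that script worth trying in isolation is the conversion from a `item;item;item` cell into the `connect` shape. Below is a slightly more defensive sketch of the same idea. The name `toConnectList` is mine, and the trimming is an assumption on my part, in case your CSV export leaves spaces around values.

```javascript
// Hypothetical variant of createFormattedArray from the script above.
// "Bold_Red;Medium_Red" becomes [{ slug: 'Bold_Red' }, { slug: 'Medium_Red' }].
// Empty cells become null, because a [Type!] argument may be null
// but may not contain empty entries.
function toConnectList(cell) {
  const slugs = (cell || '')
    .split(';')
    .map(item => item.trim())
    .filter(item => item.length > 0)
  return slugs.length > 0 ? slugs.map(slug => ({ slug })) : null
}

console.log(toConnectList('Bold_Red;Medium_Red'))
console.log(toConnectList(''))
```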
Let's test our imported data so far. We're going to write a query that combines multiple values.
{
  foodClasses {
    name
    wineClassWFPPs {
      name
      wines {
        name
      }
    }
    wineClassPairings {
      name
      wines {
        name
      }
    }
  }
}
Hopefully you're starting to get excited! Look at all that wine data!
Now for our last import! Food!
Open importFood.js and paste in this script:
const csv = require('csvtojson')
const fetch = require('isomorphic-fetch')

// Our endpoint, which we can get from the project dashboard
const endpoint = "YOUR_API_ENDPOINT"

// Typically you'd never want to put a token here
// in plain text, but for our little script, it's ok.
const token = "YOUR_TOKEN"

// Our mutation to write data to our database
const mutation = `mutation CreateFood($value: String!, $kind: String!) {
  createFood(data: {
    name: $value,
    status: PUBLISHED,
    foodClass: {
      connect: {
        slug: $kind
      }
    }
  }) {
    name
  }
}`;

// Our script to import the data
csv()
  .fromFile('./data/food.csv')
  .then(foods => {
    const promises = foods.map(async food => {
      try {
        // The Fetch statement to send the data for each
        const resp = await fetch(endpoint, {
          headers: {
            'Content-Type': 'application/json',
            'Authorization': `Bearer ${token}`
          },
          method: 'POST',
          body: JSON.stringify({ query: mutation, variables: food })
        })

        // Parse the response to verify success
        const body = await resp.json()

        // Catch Errors
        if (body.errors) {
          console.log("Error", body.errors.map(error => error.message))
        }

        const data = body.data
        console.log('Uploaded', data)
        return
      } catch (error) {
        console.log("Error!", error)
      }
    })

    Promise.all(promises).then(() => console.log("Done"))
  })
And our content is imported! Let's run one last query that will check if all our data is present and connected.
{
  wines {
    name
    wineClass {
      foodClassPairings {
        name
        foods {
          name
        }
      }
    }
  }
}
How cool is that?!
Now you can go through and clean up some of the imported data. You can either roll your own script like we've done above or you can use the interface.
With the exception of the WineClass model, the display name is correct for the visualization we will be creating. And with only 9 entries, we'll just clean those up quickly by hand.
Adding the design
Let's look at the chart again. What's the best way to ask for this data in a meaningful and structured way?
It appears that our Y Axis will be comprised of Food Classes and our X Axis will be comprised of Wine Classes. Where the two meet we have a variable for either No Pairing, Pairing or Perfect Pairing.
So, for each Food Class I want the Wine Class pairings and the nested food. I want to group the Food Classes by Food Category. Here's how that looks:
{
  foodClasses(orderBy: foodCategory_ASC) {
    name
    foodCategory
    foods {
      name
    }
    wineClassPairings(orderBy: slug_ASC) {
      name
      slug
      wines {
        name
        wineClass {
          name
        }
      }
    }
    wineClassWFPPs(orderBy: slug_ASC) {
      name
      slug
      wines {
        name
        wineClass {
          name
        }
      }
    }
  }
}
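With that response shape, working out the value of a single cell in the chart is a short lookup. The sketch below is one way to do it; `pairingLevel` is a name I made up, and the sample entry is trimmed down from the RED MEAT row of foodClasses.csv.

```javascript
// Hypothetical helper: what goes in the cell where a Food Class row
// meets a Wine Class column?
function pairingLevel(foodClass, wineClassSlug) {
  const hasSlug = list => (list || []).some(wineClass => wineClass.slug === wineClassSlug)
  // Check the perfect pairings first, so they win over a plain pairing.
  if (hasSlug(foodClass.wineClassWFPPs)) return 'Perfect Pairing'
  if (hasSlug(foodClass.wineClassPairings)) return 'Pairing'
  return 'No Pairing'
}

// Trimmed-down entry based on the RED MEAT row of foodClasses.csv.
const redMeat = {
  name: 'RED MEAT',
  wineClassPairings: [{ slug: 'Bold_Red' }, { slug: 'Medium_Red' }],
  wineClassWFPPs: [{ slug: 'Bold_Red' }]
}

console.log(pairingLevel(redMeat, 'Bold_Red'))    // Perfect Pairing
console.log(pairingLevel(redMeat, 'Medium_Red'))  // Pairing
console.log(pairingLevel(redMeat, 'Light_White')) // No Pairing
```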
Because this is an infographic, we are going to use the de facto standard, the D3 library. It's relatively straightforward and quite powerful. I'll be providing all the code if you don't know D3, but the project itself will be heavily documented if you'd like to learn more about what's going on!
We will be using the awesome online service Observable HQ for our visualization. It allows us to put together code and text in a meaningful way without having to set up our own servers. Think of it as a code playground.
One thing to note right away is that the infographic was not programmatically generated. The data PRESENTATION is not structured according to any kind of easily derived logic. Wine Classes are sorted according to taste and Food Categories seem to be sorted more or less at random. The Food Classes themselves don't follow any logic except for perhaps the more common items are at the top of the list. To start with, we will use simple logic, so our order of the data will be misaligned with the provided image, but structurally it should look quite similar.
I've also manipulated the data provided through various transforms when I needed to. This allows us to work with a single data query. The reality is, if you are ok making a handful of extra queries, GraphQL is perfectly suited for data visualizations!
With D3, we have access to scale functions that allow us to map data to defined outputs. As an example, I've mapped the discrete input of our Food Classes (Red Meat, Cured Meat, etc) to the numeric range of 0 through ~1200. That means that if I pass a value of Alliums into the scale function, I'd get a height readout roughly half way down the infographic! That's really cool!
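If you haven't met scales before, the idea fits in a few lines of plain JavaScript. This is a hand-rolled stand-in so the example runs without D3 installed; it is not the code from the project, and as far as I can tell it behaves roughly like an unpadded D3 band scale. The name `bandScale` is mine.

```javascript
// Hand-rolled stand-in for a band scale: every item in the domain
// gets an equal slice of the numeric range.
function bandScale(domain, [rangeStart, rangeEnd]) {
  const step = (rangeEnd - rangeStart) / domain.length
  const scale = value => {
    const index = domain.indexOf(value)
    return index === -1 ? undefined : rangeStart + index * step
  }
  scale.bandwidth = () => step
  return scale
}

const y = bandScale(['RED MEAT', 'CURED MEAT', 'PORK', 'POULTRY'], [0, 1200])

console.log(y('RED MEAT')) // 0
console.log(y('PORK'))     // 600
console.log(y.bandwidth()) // 300
```

Pass in a Food Class name, get back the pixel position where its row starts. That is all the scale does for us here.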
The problem arises with our Food Category, such as Meat. We need a way to tell the rectangle to start at a given height and end at a given height, but our data isn't formatted for that! What we need is a new shape of our data, an array of objects that are bucketed by Food Category (so we can iterate over our roughly 7 Categories) but with access to their "contained" Food Classes so we can use our scale function to fetch the first and last item in the respective grouping and their respective positions. Confused yet?
Here's how our master query from above provided the data:
"foodClasses": [
  {
    "name": "HARD CHEESE",
    "foodCategory": "Dairy",
    "foods": [Object, Object, Object, Object, Object],
    "wineClassPairings": [Object, Object, Object, Object, Object],
    "wineClassWFPPs": [Object]
  },
  {
    "name": "PUNGENT CHEESE",
    "foodCategory": "Dairy",
    "foods": [Object, Object, Object, Object],
    "wineClassPairings": [Object, Object, Object, Object, Object, Object, Object],
    "wineClassWFPPs": [Object, Object]
  },
  ...
]
But for the labels I need data shaped more like this:
{
  "Dairy": [
    { "name": "HARD CHEESE" },
    { "name": "PUNGENT CHEESE" },
    ...
  ]
}
I used this function to format the data how I need:

```javascript
foodCategoryBuckets = () => {
  // foodCategoryReduce is an already
  // reduced array of Food Category Strings
  const data = foodCategoryReduce.reduce(
    (result, item, index) => {
      result[item] = [];
      return result
    }, {})

  // wineIsServed.foodClasses is the fetched master data
  for (let key of Object.keys(data)) {
    data[key] = wineIsServed.foodClasses.filter(d => d.foodCategory === key)
  }

  return data
}
```
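To show where the buckets pay off, here is a self-contained sketch that goes from the flat list to the top and bottom edge of each category rectangle. It runs outside Observable, so it uses its own tiny sample and a fixed row height in place of the D3 scale. `rowHeight`, `rowTop` and `categoryExtents` are names I invented, and 40 pixels is an arbitrary assumption.

```javascript
// Master data trimmed to the two fields we need. The order matters:
// it is the order in which the rows are drawn.
const foodClasses = [
  { name: 'HARD CHEESE', foodCategory: 'Dairy' },
  { name: 'PUNGENT CHEESE', foodCategory: 'Dairy' },
  { name: 'RED MEAT', foodCategory: 'Meat' },
  { name: 'CURED MEAT', foodCategory: 'Meat' },
  { name: 'PORK', foodCategory: 'Meat' }
]

const rowHeight = 40 // assumed height of one Food Class row, in pixels

// Stand-in for the scale function: where does a row start?
const rowTop = name => foodClasses.findIndex(d => d.name === name) * rowHeight

// Bucket by category, then read the first and last member of each bucket.
function categoryExtents(classes) {
  const buckets = {}
  for (const d of classes) {
    if (!buckets[d.foodCategory]) buckets[d.foodCategory] = []
    buckets[d.foodCategory].push(d)
  }
  return Object.entries(buckets).map(([category, members]) => {
    const first = members[0]
    const last = members[members.length - 1]
    return {
      category,
      top: rowTop(first.name),
      bottom: rowTop(last.name) + rowHeight
    }
  })
}

console.log(categoryExtents(foodClasses))
// [ { category: 'Dairy', top: 0, bottom: 80 },
//   { category: 'Meat', top: 80, bottom: 200 } ]
```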
I could have also just written another GraphQL query like this:
graphql
query getFoodPairing {
Vegetables: foodClasses (
where:{
foodCategory: Vegetables
}) {
name
}
Dairy: foodClasses(
where: {
foodCategory: Dairy
}) {
name
}
Meat: foodClasses(
where: {
foodCategory: Meat
}) {
name
}
...
}
This uses a feature called "Aliases", which allows us to map the data to new keys on the response. The flexibility of GraphQL really shines here! Because the size of our data is relatively small, it was a better cost-benefit to simply modify my data on the client. For more complex data munging, I recommend looking at libraries like Crossfilter, which are tailor-made for these kinds of issues.
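Writing one alias per category by hand gets tedious, so if you do go the query route you could generate the query string from the list of enum values. This is a sketch, and `buildAliasedQuery` is a made-up name. It leans on the fact that our enum values are already valid GraphQL names, so they can double as aliases.

```javascript
// Hypothetical helper: one aliased foodClasses selection per category.
function buildAliasedQuery(categories) {
  const selections = categories.map(
    category => `  ${category}: foodClasses(where: { foodCategory: ${category} }) { name }`
  )
  return `query getFoodPairing {\n${selections.join('\n')}\n}`
}

console.log(buildAliasedQuery(['Vegetables', 'Dairy', 'Meat']))
```

The response then comes back keyed by category, which is the bucketed shape the labels need.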
Open the API Gates!
The last thing we need to do is open up our API for public consumption. To do this, we need to enable the API for public read.
NOTE: You have limits on the number of READS for your project! If you turn this on and share it with a bunch of friends, it's possible you'll run into rate limiting issues! For a small test, it's likely not a problem, but this is not a good production pattern!
And with that, we can move on to the project! The remainder of this tutorial can be found at Observable HQ.
For reference, this is the core of what we'll be creating, plus some extras and a few interactive charts as well!
You can find the whole project here.
See you over at Observable HQ!