GraphQL User guide#
This guide documents the basics of using Pyxis GraphQL, as well as using GraphQL in general. For optimizing your queries see partners.md.
If you’re looking for how to acquire access to Pyxis in the first place, see here.
GraphQL#
Because GraphQL is a relatively new language, this guide includes a short overview of it. Most users are familiar with REST and might want to apply the same kind of thinking to GraphQL, but this is not the best approach. GraphQL does not really have anything similar to REST endpoints; instead, it defines types with fields that resemble objects. Let’s look at an example of such a type for a user object:
type User {
  id: ID
  name: String
  surname: String
  age: Int
}
Four fields are defined here (id, name, surname, age), each with a data type (ID, String and Int). You might be wondering about the ID type: by default it is serialized in the same way as a String, so you can treat it as a String that is meant to be used as an identifier. A GraphQL schema is composed of such types, along with a special Query type that is used to define what can be queried.
Let’s define the Query type:
type Query {
  get_user(id: ID!): User
}
This defines a get_user query that takes one argument named id and returns a User type. The id argument is required to be non-null by the ! type modifier. All of the code examples mentioned so far would be a part of the schema on the server side.
From the client side you’d be able to query a user with an id of abc123 as follows:
query Example {
  get_user(id: "abc123") {
    name
    surname
  }
}
Note how we are able to query for just the fields that we need (name and surname in this case), which is the biggest difference from REST APIs. If you’re making just one query, you can omit the query Example part.
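Another great feature is that you can run several queries in a single call. Because both queries below call get_user, each one needs an alias (the name before the colon) so that the two results get different keys in the response:
{
  first: get_user(id: "abc123") {
    name
    surname
  }
  second: get_user(id: "edf456") {
    name
    age
  }
}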
This concludes the short introduction to GraphQL. To learn more, take a look at some of the listed sources:
Quickstart#
In this section we’ll cover how we do GraphQL in Pyxis. Note that you should have some basic knowledge of GraphQL going into this section.
First, get your Pyxis GraphQL up and running, as described in the README.md. You should have http://localhost:8000/graphql/ showing you a GraphQL playground. On the right side you can browse the schema definition and documentation.
We use GraphQLCRUD for query naming. Queries starting with get_ fetch a single resource, while queries that begin with find_ fetch paginated results.
Response format and error handling#
Unlike REST, GraphQL does not use HTTP status codes to communicate errors. Instead, applications are expected to communicate errors in the response. That’s why we use a unified format across Pyxis GraphQL:
{
  data {
    error {
      status
      detail
    }
    data {}
  }
  errors [{
    message
  }]
}
The top-level data/errors structure is provided by the GraphQL server. If errors is not present, the top-level data is valid. Inside it we provide data and error, where error reports any errors that occurred during the resolve process.
So when fetching a resource, you’d first check the error key. If that field is null, the query was successful and the data field is valid. The type found in data depends on the query you’re doing, more specifically on the type you’re querying for.
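The two-level check described above can be sketched in Python. This is only an illustration, not part of Pyxis: the function name unwrap and the exception class PyxisGraphQLError are made up for this example, and the response is assumed to be the already-decoded JSON body.

```python
class PyxisGraphQLError(Exception):
    """Hypothetical exception type used only in this sketch."""


def unwrap(response: dict, query_name: str):
    """Return the inner data of one query, or raise if either error level is set."""
    # Level 1: errors reported by the GraphQL server itself.
    top_errors = response.get("errors")
    if top_errors:
        messages = "; ".join(e.get("message", "") for e in top_errors)
        raise PyxisGraphQLError(f"GraphQL error: {messages}")

    # Level 2: the error object returned by the resolver for this query.
    result = response["data"][query_name]
    error = result.get("error")
    if error is not None:
        raise PyxisGraphQLError(f"{error.get('status')}: {error.get('detail')}")

    return result["data"]


ok = {"data": {"get_image": {"error": None, "data": {"_id": "abc"}}}}
print(unwrap(ok, "get_image"))
```

Raising on either level keeps the calling code free of repeated null checks; returning a tuple of (data, error) would be an equally reasonable design.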
get_ queries#
Let’s look at a single resource query first.
GraphQL query:
{
  get_image(id: "abc") {
    error {
      status
      detail
    }
    data {
      _id
      brew {
        build
        nvra
      }
    }
  }
}
REST endpoint call equivalent:
https://catalog.redhat.com/api/containers/v1/images/id/abc?include=_id,brew.build,brew.nvra
Here we are fetching a single container Image with the id of abc and getting its _id and some nested brew fields.
So, executing the query successfully should yield a response similar to this example:
{
  "data": {
    "get_image": {
      "error": null,
      "data": {
        "_id": "abc",
        "brew": {
          "build": "package-1.1-11",
          "nvra": "package-1.1-11.architecture"
        }
      }
    }
  }
}
find_ queries#
Find queries use the same format, but add some more parameters because of pagination.
Let’s look at the definition of the find_images query:
find_images(page_size: Int = 50, page: Int = 0, sort_by: [sort_by!])
Notice that you can specify how many results per page you want, as well as the page, just as you would in Pyxis REST. The default is 50 results per page, starting at the first page (page 0). You can also specify sorting with a list of sort_by items, each of which consists of a field name and either DESC or ASC for the order.
Let’s see an example now.
GraphQL query:
{
  find_images(page_size: 3, sort_by: [{ field: "creation_date", order: DESC }]) {
    error {
      status
      detail
    }
    page
    page_size
    total
    data {
      _id
      creation_date
    }
  }
}
REST endpoint call equivalent:
https://catalog.redhat.com/api/containers/v1/images?page_size=3&sort_by=creation_date[desc]&include=data._id,data.creation_date
In this query we request the 3 images with the most recent creation_date. Because this result is paginated, we can also get the page, page_size and total number of results for the query. The result might look something like this:
{
  "data": {
    "find_images": {
      "error": null,
      "page": 0,
      "page_size": 3,
      "total": 607006,
      "data": [
        {
          "_id": "A",
          "creation_date": "2020-11-06T06:51:40.244000+00:00"
        },
        {
          "_id": "B",
          "creation_date": "2020-11-06T06:51:38.670000+00:00"
        },
        {
          "_id": "C",
          "creation_date": "2020-11-06T06:51:35.356000+00:00"
        }
      ]
    }
  }
}
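The page, page_size and total fields are enough to walk through all results of a find_ query. The following Python sketch shows one way to do that. The fetch_page callable is hypothetical: it stands for whatever code actually sends the query and returns the object found under the query name. Here it is replaced by a fake that serves items from memory, so the example runs without a server.

```python
from typing import Callable, Iterator


def iter_results(fetch_page: Callable[[int, int], dict], page_size: int = 50) -> Iterator[dict]:
    """Yield every item of a find_ query, requesting one page at a time.

    fetch_page(page, page_size) is a hypothetical callable that runs the query
    and returns the inner result object (error, page, page_size, total, data).
    """
    page = 0
    while True:
        result = fetch_page(page, page_size)
        if result.get("error") is not None:
            raise RuntimeError(f"query failed: {result['error']}")
        items = result.get("data") or []
        yield from items
        page += 1
        # Stop when the last page was empty or all `total` results were seen.
        if not items or page * page_size >= result["total"]:
            break


def fake_fetch(page: int, page_size: int) -> dict:
    """Stand-in for a real request: serves slices of an in-memory list."""
    ids = ["A", "B", "C", "D", "E"]
    chunk = ids[page * page_size:(page + 1) * page_size]
    return {"error": None, "page": page, "page_size": page_size,
            "total": len(ids), "data": [{"_id": i} for i in chunk]}


print([item["_id"] for item in iter_results(fake_fetch, page_size=2)])
```

With a total in the hundreds of thousands, as in the response above, iterating over every page is rarely a good idea; narrowing the query with a filter first is usually the better choice.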
Advanced filtering#
For find_ queries, complex filtering is available through the filter argument of the query.
Let’s start with a simple query where we fetch only published repositories.
GraphQL query:
{
  find_repositories(filter: { published: { eq: true } }) {
    error {
      status
      detail
    }
    page
    page_size
    total
    data {
      _id
    }
  }
}
REST endpoint call equivalent:
https://catalog.redhat.com/api/containers/v1/repositories?filter=published==true&include=data._id
The syntax for a filter item is {field: {condition: value}}, where the available conditions differ for each data type. The GraphQL playground should provide autocomplete with a list of available filters for every field, but generally we replicate the functionality of the REST filtering language.
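When queries are generated by a program, the {field: {condition: value}} syntax maps naturally onto nested dictionaries. The following Python sketch renders such a dictionary as GraphQL input text. The helper name to_graphql_literal is made up for this example, and the sketch deliberately does not handle enum values (such as DESC), which GraphQL writes without quotes.

```python
import json


def to_graphql_literal(value) -> str:
    """Render a Python value as a GraphQL input literal (sketch, no enum support)."""
    if isinstance(value, dict):
        fields = ", ".join(f"{key}: {to_graphql_literal(val)}" for key, val in value.items())
        return "{" + fields + "}"
    if isinstance(value, (list, tuple)):
        return "[" + ", ".join(to_graphql_literal(item) for item in value) + "]"
    if value is None:
        return "null"
    if isinstance(value, bool):  # checked before int: bool is a subclass of int
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return repr(value)
    if isinstance(value, str):
        return json.dumps(value, ensure_ascii=False)  # adds quotes and escapes
    raise TypeError(f"unsupported value: {value!r}")


print(to_graphql_literal({"published": {"eq": True}}))
```

Passing the filter as a query variable avoids text rendering altogether, but that requires knowing the name of the filter input type in the schema, which this sketch does not assume.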
Chaining filters#
Let’s look at nested filters and chaining filters together. This time we focus only on the filter field to avoid repeating the rest of the query.
{
  find_repositories(
    filter: {
      and: [
        { published: { eq: true } }
        { contacts: { email_address: { eq: "john@doe.com" } } }
      ]
    }
  )
}
REST filter equivalent:
filter=published==true and contacts.email_address=="john@doe.com"
At the top level (right after filter), you can chain filters with the and or or keywords, followed by a list of filters. One of the filters in the above query also specifies a filter for a nested field, contacts.email_address.
At the top level you can also combine and and or together like this:
{
  find_repositories(
    filter: {
      and: [
        { published: { eq: true } }
        {
          or: [
            { contacts: { email_address: { eq: "bob@doe.com" } } }
            { contacts: { email_address: { eq: "alice@doe.com" } } }
          ]
        }
      ]
    }
  )
}
REST filter equivalent:
filter=published==true and (contacts.email_address=="bob@doe.com" or contacts.email_address=="alice@doe.com")
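The combined filter above can also be assembled programmatically. The following Python sketch builds the same structure as a nested dictionary. The helper names cond, all_of and any_of are made up for this example; how the resulting dictionary is sent to the server (as a variable value or rendered into the query text) is outside the scope of the sketch.

```python
def cond(path: str, operator: str, value) -> dict:
    """Build {field: {operator: value}}, nesting one level per dot in `path`."""
    item = {operator: value}
    for name in reversed(path.split(".")):
        item = {name: item}
    return item


def all_of(*filters: dict) -> dict:
    """Chain filters with `and`."""
    return {"and": list(filters)}


def any_of(*filters: dict) -> dict:
    """Chain filters with `or`."""
    return {"or": list(filters)}


combined = all_of(
    cond("published", "eq", True),
    any_of(
        cond("contacts.email_address", "eq", "bob@doe.com"),
        cond("contacts.email_address", "eq", "alice@doe.com"),
    ),
)
print(combined)
```

Because cond always starts from the full field path, the and/or operators produced by these helpers never end up inside a field.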
The main takeaway from chaining filters is that it can only happen at the top level. For example, this query, which places or inside the contacts field, is invalid:
{
  find_repositories(
    filter: {
      and: [
        { published: { eq: true } }
        {
          contacts: {
            or: [
              { email_address: { eq: "bob@doe.com" } }
              { email_address: { eq: "alice@doe.com" } }
            ]
          }
        }
      ]
    }
  )
}
Filtering arrays#
Arrays have their own special filters, which you can use to filter by array size (currently only the eq condition is available), by a condition for an item at a specified index, or by the elemMatch operator. If example_field is an array, then you can filter by its size with the example_field_size field, by its index with the example_field_index field and by elemMatch with the example_field_elemMatch field.
Let’s try to filter images by the number of layers and specify what the top layer should be.
{
  find_images(
    filter: {
      and: [
        { parsed_data: { layers_size: { eq: 1 } } }
        {
          parsed_data: {
            layers_index: { index: 0, condition: { eq: "sha256:layer_hash" } }
          }
        }
      ]
    }
  )
}
REST filter equivalent:
filter=parsed_data.layers=size=1 and parsed_data.layers.0=="sha256:layer_hash"
The syntax for the _index field is {example_field_index: {index: Int, condition: {cond: value}}}.
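The naming pattern for array filters (field name plus a _size, _index or _elemMatch suffix) is regular enough to generate. The following Python sketch shows small helpers that produce these filter items as dictionaries; the helper names are made up for this example and are not part of Pyxis.

```python
def size_filter(field: str, size: int) -> dict:
    """Filter by array size (only the eq condition is available for sizes)."""
    return {f"{field}_size": {"eq": size}}


def index_filter(field: str, index: int, operator: str, value) -> dict:
    """Apply a condition to the item at position `index` of the array."""
    return {f"{field}_index": {"index": index, "condition": {operator: value}}}


def elem_match(field: str, inner: dict) -> dict:
    """Wrap `inner` in the elemMatch filter of the given array field."""
    return {f"{field}_elemMatch": inner}


layers = {
    "and": [
        {"parsed_data": size_filter("layers", 1)},
        {"parsed_data": index_filter("layers", 0, "eq", "sha256:layer_hash")},
    ]
}
print(layers)
```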
The following query returns images that are in the rhel repository and are published.
GraphQL query:
{
  find_images(
    filter: {
      repositories_elemMatch: {
        and: [
          { repository: { eq: "rhel" } }
          { published: { eq: true } }
        ]
      }
    }
  ) {
    data {
      _id
      creation_date
      repositories {
        repository
        published
      }
    }
  }
}
REST endpoint call equivalent:
https://catalog.redhat.com/api/containers/v1/images?filter=repositories=em(repository=="rhel" and published==true)&include=data._id,data.creation_date,data.repositories.repository,data.repositories.published
Filtering subobjects and NULL#
Users can use filters to find objects where subobjects do or do not exist or are set to NULL. This can be done with queries similar to the one below.
GraphQL query:
{
  find_images(
    filter: {
      repositories: {
        eq: null
      }
    }
  ) {
    data {
      _id
      creation_date
      repositories {
        repository
        published
      }
    }
  }
}
REST endpoint call equivalent:
https://catalog.redhat.com/api/containers/v1/images?filter=repositories==null&include=data._id,data.creation_date,data.repositories.repository,data.repositories.published
The query looks for images where repositories is either null or not set at all. If we wanted to find all the objects that have this field set, we would replace the eq operator with ne.
Edges#
Edges are another new concept that GraphQL introduces.
You can imagine schema types as nodes in a graph connected by edges.
Edges in this context mean references to other types in the schema.
So what are the fantastic edges and where to find them?
They can be identified as fields with a *Response data type in a type definition in the schema. An edge also indicates that another call to Pyxis REST has to be made, so be careful about querying for too many edges, as that might result in very slow queries.
Let’s look at a query utilizing an edge in the Image type to get vulnerabilities for the image.
GraphQL query:
{
  get_image(id: "abc") {
    error {
      status
      detail
    }
    data {
      _id
      edges {
        vulnerabilities(page_size: 5) {
          error {
            status
            detail
          }
          data {
            _id
            active
          }
        }
      }
    }
  }
}
The GraphQL query would be equivalent to the following two REST endpoint calls:
https://catalog.redhat.com/api/containers/v1/images/id/abc?include=_id
https://catalog.redhat.com/api/containers/v1/images/id/abc/vulnerabilities?page_size=5&include=data._id,data.active
And the edge field definition in the schema:
type Image {
  ...
  vulnerabilities(page_size: Int = 50, page: Int = 0, filter: VulnerabilityFilter): VulnerabilityListResponse
  ...
}
You can see that since this is a get many type of edge, you can control pagination and even filter the results. Beware, though: you can’t affect the parent type with these filters, nor can you use the filter argument on the parent get_image to filter edge fields.
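Because an edge is resolved by a separate call, it carries its own error field next to its data, and a client has to check both the parent and the edge. The following Python sketch reads the vulnerabilities edge from a decoded response. The function name edge_data is made up for this example, and the sample response contains invented values; only its shape, which mirrors the query above, matters.

```python
def edge_data(parent: dict, edge_name: str):
    """Return the data of one edge from an already-fetched parent result.

    `parent` is the object found under the query name (for example under
    "get_image"); both it and the edge carry their own error field.
    """
    if parent.get("error") is not None:
        raise RuntimeError(f"parent query failed: {parent['error']}")
    edge = parent["data"]["edges"][edge_name]
    if edge.get("error") is not None:
        raise RuntimeError(f"edge {edge_name!r} failed: {edge['error']}")
    return edge["data"]


# Invented response for the vulnerabilities query; only the shape matters.
response = {
    "data": {
        "get_image": {
            "error": None,
            "data": {
                "_id": "abc",
                "edges": {
                    "vulnerabilities": {
                        "error": None,
                        "data": [{"_id": "v1", "active": True}],
                    }
                },
            },
        }
    }
}
print(edge_data(response["data"]["get_image"], "vulnerabilities"))
```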
There are also get one edges which contain only a single type, e.g. the rpm_manifest edge in Image:
type Image {
  ...
  rpm_manifest: RPMManifestResponse
  ...
}
In which case the query would look like this:
{
  get_image(id: "abc") {
    error {
      status
      detail
    }
    data {
      _id
      edges {
        rpm_manifest {
          error {
            status
            detail
          }
          data {
            _id
          }
        }
      }
    }
  }
}
The GraphQL query would be equivalent to the following two REST endpoint calls:
https://catalog.redhat.com/api/containers/v1/images/id/abc?include=_id
https://catalog.redhat.com/api/containers/v1/images/id/abc/rpm_manifest?include=_id
Variables#
Variables can be used to replace static query parameters and make queries reusable and more readable. When parameters are replaced by query variables, the query can be used repeatedly with different variable values. Variable values are sent as part of the request payload in a dictionary saved under the variables key.
The variable must be declared in the query parameters using the syntax $variableName: VariableType. Then it can be referenced inside the query by its name.
For example, a simple get_image query can be transformed using variables as follows:
Old query:
{
  get_image(id: "000000000000000000000123") {
    error {
      status
      detail
    }
    data {
      _id
    }
  }
}
Generalized query with variable:
query GetImageQuery($imageID: ObjectIDFilterScalar) {
  get_image(id: $imageID) {
    error {
      status
      detail
    }
    data {
      _id
    }
  }
}
Payload with variables:
{
  "query": "...",
  "variables": {
    "imageID": "000000000000000000000123"
  }
}
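The payload above can be produced with the Python standard library alone. The following sketch prepares, but does not send, one request per image ID while reusing the same query text. It assumes the local playground URL from the Quickstart and a JSON POST body, which is the common convention for GraphQL servers; authentication is left out, the helper name build_request is made up, and the second image ID is invented for illustration.

```python
import json
import urllib.request

GET_IMAGE = """
query GetImageQuery($imageID: ObjectIDFilterScalar) {
  get_image(id: $imageID) {
    error { status detail }
    data { _id }
  }
}
"""


def build_request(url: str, query: str, variables: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST request carrying the query and its variables."""
    body = json.dumps({"query": query, "variables": variables}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# The same query text is reused; only the variable value changes.
requests_to_send = [
    build_request("http://localhost:8000/graphql/", GET_IMAGE, {"imageID": image_id})
    for image_id in ("000000000000000000000123", "000000000000000000000124")
]
print(json.loads(requests_to_send[0].data)["variables"])
```

Each prepared request could then be passed to urllib.request.urlopen. Sending values as variables also means the JSON encoder handles quoting, so IDs never have to be spliced into the query text by hand.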
Aliases#
Aliases can be used to rename fields returned by a GraphQL query. An alias can be easily created using the alias_name: selected_field syntax.
This can be useful for calling multiple queries with the same name in one GraphQL request.
In this example, using aliases avoids an error that would be caused by a query name conflict:
{
  image1: get_image(id: "000000000000000000000000") {
    error {
      status
      detail
    }
    data {
      _id
    }
  }
  image2: get_image(id: "000000000000000000000001") {
    error {
      status
      detail
    }
    data {
      _id
    }
  }
}
Thanks to the aliases, the query returned a response with valid JSON without duplicate keys:
{
  "data": {
    "image1": {
      "error": null,
      "data": {
        "_id": "000000000000000000000000"
      }
    },
    "image2": {
      "error": null,
      "data": {
        "_id": "000000000000000000000001"
      }
    }
  }
}
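When the number of images is not known in advance, the aliased query can be generated. The following Python sketch builds one query document with an alias per ID, following the image1, image2 naming used above. The function name build_batch_query is made up for this example.

```python
import json


def build_batch_query(image_ids: list) -> str:
    """Build one query document that fetches several images, one alias per ID."""
    selections = []
    for position, image_id in enumerate(image_ids, start=1):
        alias = f"image{position}"
        selections.append(
            # json.dumps adds the quotes and escapes the ID string.
            f"  {alias}: get_image(id: {json.dumps(image_id)}) {{\n"
            "    error { status detail }\n"
            "    data { _id }\n"
            "  }"
        )
    return "{\n" + "\n".join(selections) + "\n}"


print(build_batch_query(["000000000000000000000000", "000000000000000000000001"]))
```

Each alias becomes a key in the response, so the results can be matched back to the requested IDs by position.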