
Introduce Atomic Operations extension #1437


Merged: 16 commits, Oct 1, 2020


dgeb
Member

@dgeb dgeb commented Oct 8, 2019

This is a proposal for an official extension to the JSON:API spec, as described in #1435. This proposal is based upon #1254, and supersedes that PR.

This extension provides a means to perform multiple "operations" in a linear and atomic manner. Operations are a serialized form of the mutations allowed in the base JSON:API specification. It uses the namespace atomic to emphasize the atomicity guarantee it provides.

@jugaadi

jugaadi commented Oct 8, 2019

Doubts:

  1. The base spec says: "A server MAY choose to stop processing as soon as a problem is encountered, or it MAY continue processing and encounter multiple problems." How do we map each error object to its corresponding operation object in the transaction? Or should we stop processing the transaction as soon as we encounter an error?
  2. How do we ensure idempotency with respect to a transaction? Can we pass an atomic:key along with the operations object to identify the transaction?
  3. Do we need a ref object when we already have a data object that clearly specifies the target resource and op property to provide the necessary context?
  4. In the context of operations,
    1. Does order matter when multiple operations are issued for a single resource? Ex. Relationship members are first deleted and then updated.
    2. Do we need multiple ways to handle an operation? For example:
      1. Add a new relationship or a new member.
        • To-One: op : update , data : null
        • To-Many: op: remove, data : [{...}]
      2. Update/Replace a relationship. Additionally, it can be also used to add or delete a relationship.
        • Through op: update wrt Resource
        • Through op: update wrt Resource Relationship
      3. Remove/Clear a relationship
        • To-One: op : update , data : null
        • To-Many: op: remove, data : [{...}]

Suggestions:

  1. As this extension dictates server processing wrt transactional semantics, we can consider using transaction as namespace instead of atomic.

@tobyzerner
Contributor

tobyzerner commented Oct 9, 2019

@jugaadi

Re: point 1 - Error objects may contain a source member which could point to the operation in which the error occurred (or a specific path within the operation object). I'm assuming that once an error is encountered in an operation, no further operations should be processed, but I think the extension is ambiguous on this matter?
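
As a rough illustration of the `source` idea above, a client could map each error back to its operation by reading the array index out of the error's JSON Pointer. This is only a sketch, assuming the server emits pointers of the form `/atomic:operations/<index>/...`; the helper name `operation_index` is hypothetical and not part of the extension.

```python
from typing import Optional


def operation_index(error: dict) -> Optional[int]:
    """Return the index of the operation an error points at, or None.

    Assumption: the server sets source.pointer to a JSON Pointer that
    begins with /atomic:operations/<index>.
    """
    pointer = error.get("source", {}).get("pointer", "")
    parts = pointer.split("/")
    # A matching pointer splits into ["", "atomic:operations", "<index>", ...].
    if len(parts) >= 3 and parts[1] == "atomic:operations" and parts[2].isdecimal():
        return int(parts[2])
    return None


errors = [
    {"status": "422", "source": {"pointer": "/atomic:operations/1/data/attributes/title"}},
    {"status": "400", "source": {"pointer": "/data"}},
]
indices = [operation_index(e) for e in errors]
```

Errors whose pointer does not start with `/atomic:operations/` (or that carry no `source` at all) simply map to `None`, so the client can treat them as request-level problems.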

Re: point 3 - the data object alone is not sufficient to identify the target resource, as the type and the target resource collection are not necessarily the same. This is demonstrated in the example for creating a resource, where the href is "/blogPosts" and the type is "articles". ref is also required to reference a specific relationship.

Re: point 4.i -

A server MUST perform operations in the order they appear in the
atomic:operations array.

So the end result would be the same as if multiple separate requests to the API were issued, but with the guarantee of atomicity.
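
To make the "in order, all or nothing" behaviour concrete, here is a minimal sketch against an in-memory store. Everything in it is an assumption for illustration: the store layout (a dict keyed by `(type, id)`), the `OperationError` class, and the choice to require an `id` on added resources. A real server would more likely rely on a database transaction.

```python
import copy


class OperationError(Exception):
    """Raised when a single operation cannot be applied."""


def apply_operation(store: dict, op: dict) -> None:
    """Apply one operation to a store mapping (type, id) to attributes."""
    kind = op["op"]
    if kind == "remove":
        key = (op["ref"]["type"], op["ref"]["id"])
        if key not in store:
            raise OperationError(f"{key} does not exist")
        del store[key]
    elif kind in ("add", "update"):
        data = op["data"]
        key = (data["type"], data["id"])
        if kind == "add":
            if key in store:
                raise OperationError(f"{key} already exists")
            store[key] = dict(data.get("attributes", {}))
        else:
            if key not in store:
                raise OperationError(f"{key} does not exist")
            store[key].update(data.get("attributes", {}))
    else:
        raise OperationError(f"unsupported op: {kind}")


def process_atomically(store: dict, operations: list) -> None:
    """Apply operations in array order; commit only if every one succeeds."""
    working = copy.deepcopy(store)
    for op in operations:
        apply_operation(working, op)  # any exception abandons the working copy
    store.clear()
    store.update(working)


store = {("articles", "1"): {"title": "Old"}}
try:
    process_atomically(store, [
        {"op": "update", "data": {"type": "articles", "id": "1", "attributes": {"title": "New"}}},
        {"op": "remove", "ref": {"type": "comments", "id": "404"}},  # fails
    ])
except OperationError:
    pass  # the update is discarded together with the failed remove
```

Because the operations run against a copy, the failed `remove` leaves `store` exactly as it was before the request.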

@tobyzerner
Contributor

Overall looks good to me, would be very happy to see this as an official extension and hopefully one day in the base spec. I may try implementing this soon so could give more feedback as a result of that.

Should the extension have anything to say about how an implementation should handle security? e.g. limiting the number of operations per request. I suppose not as it's in the same domain as rate-limiting which the base spec doesn't say anything about.

@Doqnach

Doqnach commented Oct 9, 2019

Would this mean, this will also cover the case of wanting to POST several resources at once?

At first I thought it was fine to POST a {"data":[{...},{...},...]}, but the spec actually says you can only POST one resource at a time...

Personally I would prefer the above notation for such a case though instead of using an extension.

@gabesullice
Contributor

gabesullice commented Oct 9, 2019

@jugaadi:

  1. I think an error object's /source/pointer member is sufficient to identify the operation object as the source of an error. Does that work in your opinion?
  2. The spec does not currently have any built-in idempotency mechanisms and I don't think atomic should introduce them. I would love to see an extension that explores Etag + If-Match inter alia to add more idempotency guarantees to the spec as a whole. It seems that it's an orthogonal concern that would benefit the spec in many areas.
  3. I think we do need the ref object. Consider the case of adding a relationship. The data member will be a resource identifier, not a resource object, thus the ref is necessary to communicate to which relationship the resource identifier object should be added. Also, consider the case of a remove operation (for resource objects and relationships).
  4. ...
    i. Order does matter, operations must be processed serially.
    ii. Sorry, I don't understand what you're asking or pointing out with these examples. Would you mind clarifying?

WRT transaction vs atomic, I think @dgeb had a reason he preferred the latter. I'll let him respond.

@tobyzerner:

Thanks for your reply! I already wrote the above, before I saw this. You're right about /source/pointer! To answer your question, yes, the extension is intentionally silent about whether a server should continue processing or not. It's an implementation detail that is use-case specific I think. F.e. adding resource objects in bulk would probably benefit from it, but adding a resource object that is a target of a relationship and then adding the resource object that targets it would not.

Re: point 3, spot on!

Re: point 4, exactly :) Good spec reading!

Re: 200 vs 204, I believe the spec is this way for two reasons: 1) so that a client does not have to unnecessarily update its internal representation and 2) to save bytes.

I'm glad to hear you like the spec and are considering implementing it! That's great :)

TBH, I'm ambivalent about a security section. There's nothing normative we ought to say about it, but I know many specs do try to explicitly call out specific security considerations.

@Doqnach:

Yes. This allows you to create many resource objects in one request. I agree that your syntax would be simpler for the simple use case you describe, but there are many hidden edge cases. F.e. can those added resources reference one another? If so, must they be created in a specific order? If we support serial additions, why not serial mutations? This extension solves these complex cases while also solving the use case you mentioned, albeit with a little bit less syntactic sugar.

@jugaadi

jugaadi commented Oct 10, 2019

@tobyzerner @gabesullice @dgeb Thanks for the clarifications.

Below are some of my concerns. Do correct me if I'm wrong.

  1. Errors: Yes /source/pointer would solve the problem. However, I would still prefer atomic:results to reflect a successful as well as an error response so that http status is factored in the result object. Have included a format in the suggestion section.
  2. Idempotency: My assumption was that Local IDs of resource objects could be reused to solve idempotency issues. However, in the context of operations, idempotency guarantees become a necessity because the same transaction, if processed twice, may result in an inconsistent state. For example:
    • T1:
      1. Create b1
      2. Update a1 relationship: a1 - b1
    • T1:
      1. Create b1 # New b2 will be created.
      2. Update a1 relationship: a1 - b2
  3. Ref: My understanding was that href could be compulsorily used for all operations instead of ref. When combined with the basic data element and op element, the server should be able to derive the reference object.

UPDATING RESOURCES

{
  "atomic:operations": [{
    "op": "update",
    "href": "/articles/1",
    "type": "articles",
    "id": "1",
    "attributes": {
        "title": "To TDD or Not"
    }
  }]
}

UPDATING RELATIONSHIPS

{
  "atomic:operations": [{
    "op": "add",
    "href": "/articles/1/relationships/responses",
    "type": "articles",
    "id": "1",
    "relationship": {
        "responses": { 
            "data": [
                { "type": "comments", "id": "123" }
            ]
        }
    }
  }]
}
  4. Operations Order: As the operations are processed in the given order, the onus lies on the client to ensure that conflicting operations are ordered properly. I was wondering if the spec can guide on addressing intra request conflicts. For example, all operations must be processed in the following precedence:
    1. Remove
    2. Add
    3. Update

Suggestions

  1. Though it is understood that asynchronous operations in a single request cannot be supported, it would be better if it is explicitly mentioned in the spec.
  2. Unlike a database transaction, all operations are handled through a single HTTP request. As a result, it would be relatively easy for a server to guarantee ACID semantics for a given request. Moreover, from a client perspective, the base spec + this spec implicitly sets ACID as a default expectation. Hence, suggested "transaction" albeit a lightweight one.
  3. A transaction must additionally support asynchronous response as a server might relay it to multiple micro-services internally.
  4. A result object can follow a common format:
    { status: <http status code>, data: <response data> }. or { <errors object which contains status> }.

@gabesullice
Contributor

gabesullice commented Oct 14, 2019

Responding to your concerns @jugaadi (also, thank you for taking the time to think so deeply about this!):

  1. I think @dgeb was trying very hard to disassociate operations from HTTP requests. An operations request is not meant to be a substitute for multiple HTTP requests. Rather, an operations-capable resource can be seen as a single REST resource where interdependent entity mutations are made atomically through a single representation of a transaction. I.e., it's not a tunnel for HTTP requests. This is why status is not a part of the extension. If one operation in the unit transaction has a client error, the entire transaction has a client error. I think it would cause confusion about the atomicity guarantee if some operations had error codes and some didn't.

  2. Hmm, my feeling is that idempotence is something that ought to be solved more generally than just the atomic case. For example, there could be an extension called trace where a new, required member is added to the top-level document, like so:

{
  "trace:requestID": "{client-generated-uuid}",
  "atomic:operations": [{}, {}, "..."]
}

The server could trace requests and reject requests with IDs that it has already processed. This would work for atomic operations and also for POSTing individual resource objects. Am I missing something that makes atomic exceptional in this regard?
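
A minimal sketch of that server-side check might look like the following. The `trace` extension is itself hypothetical in this discussion, and the `RequestTracer` and `DuplicateRequest` names, along with the in-memory set of seen IDs, are assumptions made purely for illustration; a real server would need durable storage.

```python
import uuid


class DuplicateRequest(Exception):
    """Raised when a request ID has already been processed."""


class RequestTracer:
    """Remembers request IDs and rejects any that are seen twice."""

    def __init__(self) -> None:
        self._seen = set()

    def check(self, document: dict) -> None:
        request_id = document.get("trace:requestID")
        if request_id is None:
            raise ValueError("missing trace:requestID member")
        if request_id in self._seen:
            raise DuplicateRequest(request_id)
        self._seen.add(request_id)


tracer = RequestTracer()
document = {
    "trace:requestID": str(uuid.uuid4()),  # client-generated UUID
    "atomic:operations": [],
}
tracer.check(document)  # first delivery is accepted
try:
    tracer.check(document)  # a replay of the same document is rejected
    replay_rejected = False
except DuplicateRequest:
    replay_rejected = True
```

Nothing in the check depends on `atomic:operations`, which is the point being made: the same mechanism would work for a plain POST of a single resource object.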

  3. I think you're understanding it correctly. IIRC, @dgeb wanted to have ref to simplify server implementations (because they wouldn't need to dereference a URL). Maybe I'm missing some additional nuance, so don't take that as 100% of the story.

  4. Ahh, I see. Yes, you're right, since operations are processed serially, that definitely places a burden on clients to do things in the correct order. I never thought about the fact that remove, add, update would always be correct when dealing with one of each operation type. That's cool! Unfortunately, it still doesn't help a client that adds 3 resources, a, b, & c with relationships between them: a <- b <- c. The client would still need to know to create a, then b, then c.

We could give explicit guidance for clients like this:

All remove operations MUST precede any add or update operation(s). All add operations MUST precede any update operation(s). Additionally, operations which establish relationships to resource objects that are added in the same request MUST be preceded by the operations that added those related resource objects.

Just for context: Order was made a requirement to simplify server implementations. @dgeb and I felt that clients would generally have an easier time ordering operations than servers would since, in most cases, UI interactions will spawn operations. As long as those spawned operations are kept in order, they should automatically be processable in the order that they were spawned (even if it might be an inefficient order) since anything else would have been a nonsensical user flow.
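
The precedence part of the guidance proposed above (remove, then add, then update) can be sketched on the client side as follows. The function names are hypothetical, and the sketch deliberately does not cover the second rule about relationships to resources added in the same request, which needs dependency information the operation type alone does not carry.

```python
# Lower rank must come first: remove, then add, then update.
PRECEDENCE = {"remove": 0, "add": 1, "update": 2}


def follows_precedence(operations: list) -> bool:
    """True if no operation appears before one of a lower rank."""
    ranks = [PRECEDENCE[op["op"]] for op in operations]
    return all(a <= b for a, b in zip(ranks, ranks[1:]))


def reorder(operations: list) -> list:
    """Group operations by rank; sorted() is stable, so the relative
    order of operations of the same kind is preserved."""
    return sorted(operations, key=lambda op: PRECEDENCE[op["op"]])


operations = [
    {"op": "update", "label": "a"},
    {"op": "add", "label": "b"},
    {"op": "remove", "label": "c"},
    {"op": "add", "label": "d"},
]
ordered = reorder(operations)
```

Keeping the sort stable matters for the a <- b <- c case: if the client spawned the adds in a workable order, regrouping by operation type will not shuffle them.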

@jugaadi

jugaadi commented Oct 14, 2019

Thanks @gabesullice for the explanation.

  1. My understanding was based on the same premise that operation is a single resource and each individual operation is just like an attribute of the atomic:operations resource. Therefore, I just used status in the same vein as the status property in the error object, i.e.:

When a server encounters multiple problems for a single request, the most generally applicable HTTP error code SHOULD be used in the response.
...
An error object MAY have the following members:
status : the HTTP status code applicable to this problem, expressed as a string value.

If it still gives the same impression, we can avoid it. We can go ahead with the current format.

  2. If idempotence is solved in a generic way, it would be awesome. The reason it matters in this extension is the following:

    1. As bulk(n) creation of resources is supported in this spec, 'm' similar requests may allow:
      1. Creation of duplicate(m x n) resources of each type in the server.
      2. Creation of incorrect relationships because of multiple requests.
    2. Decide whether there is a need to consider PUT for atomic:operations request?
    3. Decide whether idempotence should be part of the spec so that this extension can use it.
  3. That works for me. I just wanted to avoid another keyword and format. My assumption was that current JSON:API server implementations would already use URLs (separate endpoints) to associate resources with related resources.

  4. I agree with your suggestion. Order is definitely important and arrays seem to be the best option when compared to JSON Pointers. The guidance should be good enough to resolve any ambiguity in implementation.

@jugaadi

jugaadi commented Dec 2, 2019

Any updates?


@auvipy auvipy left a comment


looks great!

@bart-degreed
Contributor

In the base JSON:API spec, a server MUST return 204 No Content with no response document only for delete and relationship update. In all other cases, it MAY, allowing the server to return the same data as in the request. This can be useful in cases where it's expensive to determine what has changed.

I believe the equivalent for atomic:operations is: "the server MUST return a result with no data or, if all results are empty, the server MAY respond with 204 No Content and no document."

The cases where this rule applies are the same as in the base spec, except for one:

"If a server accepts an update and doesn’t update any attributes besides those provided, the server MUST return a result with no data or, if all results are empty, the server MAY respond with 204 No Content and no document."

Compared to the base spec:

"If an update is successful and the server doesn’t update any attributes besides those provided, the server MUST return either a 200 OK status code and response document (as described above) or a 204 No Content status code with no response document."

My conclusion is that the atomic:operations spec is more strict than the base spec for resource updates. Is this intentional or an oversight?

I would prefer this to be loosened to MAY, as in:
"If a server accepts an update and doesn’t update any attributes besides those provided, the server MAY return a result with no data..."

@gabesullice
Contributor

@bart-degreed, great attention to detail! I do not see a problem with loosening the extension to allow the server to either respond with an empty result or "complete" result. The only thing that must be preserved is that the order and count of results must match the order and count of the requested operations.

@dgeb, am I missing some nuance that requires the extension to be more strict? The only reason that I can think of to choose the stricter wording in the extension is to allow the client to skip work if the result is empty.

@morvans

morvans commented Feb 28, 2020

While updating my earlier implementation of the Operations proposal to the new Atomic Operations extension, I found that we do not clearly state if the "atomic:operations" array can be empty, in an operations request.

- `ref`: an object that **MAY** contain any of the following combinations of
members:

- `type` and `id`: to target an individual resource.

At first sight, I think this may imply that both type and id are required, but of course that is not so for an "add" operation. Should we be more specific?

Suggested change
- `type` and `id`: to target an individual resource.
- `type` and `id`: to target an individual resource. `id` **MAY** be omitted in case of an "add" operation.

Member Author


I think I agree with this refinement. What do you think @gabesullice ?

Member Author


Sorry, I misunderstood the context here. After talking with @gabesullice we realized that ref is unnecessary in an add operation, which will already contain a type in the data member that represents the resource to be added.


Don't we still need a ref to represent the "endpoint" to which the resource will be sent?
If ref is considered unnecessary for add (we infer the endpoint from the resource itself), it also becomes unnecessary for update, right?

From an implementer POV, I like the idea of just having to look at ref (or href) to determine to which endpoint pass the data which is treated like an opaque blob at this point.


I mean, in this particular context of Atomic Operations, if ref is not necessary in an add operation, why should we have different endpoints when adding new resources through "plain" JSON:API calls ? A single /jsonapi/add would suffice.
But (in the Drupal JSON:API implementation, at least), add operations are performed on different endpoints, based on the resource type, and then there's a check that the data's type (and also id, in the case of updates) matches the endpoint URL.
I agree that this can be perceived as a kind of "redundancy", but this is already the case with existing "regular" JSON:API, right ?
Let's say I build a JSON:API implementation where every write calls goes to a /jsonapi accepting PATCH or POST requests and inferring everything from the passed data, would it still be compliant with the original JSON:API spec ?

Contributor

@gabesullice gabesullice Apr 23, 2020


@morvans:

Let's say I build a JSON:API implementation where every write calls goes to a /jsonapi accepting PATCH or POST requests and inferring everything from the passed data, would it still be compliant with the original JSON:API spec?

Yes, I think it would be. Imagine something like calendar.mydomain.com/. Here, I could POST reminders and events. A GET request to / (root) might return a mixed collection of reminders and events in chronological order. Is it good API design? I think you could argue both yes and no, but the spec doesn't prevent it.


One of the use cases that I wanted to support by using the href member in an operation object was the concept of creating resource objects via editorialized endpoints. For example: adding both a tshirt and a sandal resource object to a /summer-styles endpoint.

I think that's sort of similar to what you're asking. You might be implementing an /operations endpoint where you want to add both tshirt and sandal resource objects:

[{
  "op": "add",
  "ref": {
    "type": "tshirt"
  },
  "data": {
    "type": "tshirt",
    "attributes": {
      "sex": "male",
      "color": "blue"
    }
  }
}, {
  "op": "add",
  "ref": {
    "type": "sandal"
  },
  "data": {
    "type": "sandal",
    "attributes": {
      "sex": "unisex",
      "size": "gargantuan"
    }
  }
}]

That feels very redundant to me, honestly. However, you do say:

From an implementer POV, I like the idea of just having to look at ref (or href) to determine to which endpoint pass the data which is treated like an opaque blob at this point.

Knowing that you're implementing this within Drupal I'll take some liberties with my example...

I suspect you're talking about routing subrequests and I see why you'd want to treat the data member as an opaque blob, but you have to deserialize the entire JSON document anyway. So, it seems like the code for handling an omitted ref member below isn't too onerous or inelegant:

  $operation = $deserialized['atomic:operations'][0];
  $url = $operation['href'] ?? Url::fromRoute("jsonapi.{$operation['data']['type']}.collection.post")->toString();
  $subrequest = Request::create($url);

Since allowing ref to be omitted is elegant from the spec's perspective and there are still relatively elegant solutions for the server to handle that as well, I think it's okay. WDYT?

@dgeb
Member Author

dgeb commented Mar 4, 2020

@bart-degreed as @gabesullice said, thanks for your attention to detail! I agree with your assessment.

I have loosened the language around updates to support the same level of strictness as the base spec. The section in question now reads:

If a server accepts an update and doesn’t update any fields besides those
provided, the server MUST return a result that includes either no data or
a representation of the resource as data or, if all results are empty, the
server MAY respond with 204 No Content and no document.

Note that I've changed attributes to fields here. I think this should make it back into the base spec since "attributes" has a very specific meaning in the spec, which was never intended here.
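
The loosened rule quoted above could be sketched on the server side roughly like this. The `build_response` helper is hypothetical, as is the convention of representing an empty result as an empty dict; the sketch also always takes the optional 204 path when every result is empty, which the rule permits but does not require.

```python
from typing import Optional, Tuple


def build_response(results: list, operation_count: int) -> Tuple[int, Optional[dict]]:
    """Return (HTTP status, response document) for a processed request.

    Each result is a dict: empty for "no data", or {"data": ...} when the
    server chooses to return a representation of the resource.
    """
    # Results must line up one-to-one, in order, with the operations.
    if len(results) != operation_count:
        raise ValueError("results must match operations in count and order")
    if all(not result for result in results):
        return 204, None  # every result is empty: no document is needed
    return 200, {"atomic:results": results}


empty = build_response([{}, {}], 2)
mixed = build_response(
    [{}, {"data": {"type": "articles", "id": "1", "attributes": {"title": "New"}}}],
    2,
)
```

Note that a mixed response still carries the empty result objects, so the position of each result continues to identify the operation it belongs to.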

@dgeb
Member Author

dgeb commented Mar 4, 2020

While updating my earlier implementation of the Operations proposal to the new Atomic Operations extension, I found that we do not clearly state if the "atomic:operations" array can be empty, in an operations request.

@morvans Thanks for raising this. It's an edge case that I have not considered, but will discuss with @gabesullice.

Possible update to consider:

- `atomic:operations` - an array of one or more [operation
  objects](#operation-objects).

- `atomic:results` - an array of one or more [result objects](#result-objects).

- `atomic:operations` - an array of one or more [operation
objects](#operation-objects).

- `atomic:results` - an array of one or more [result objects](#result-objects).
Contributor

@gabesullice gabesullice Apr 17, 2020


I've been writing a simple implementation of this and this flexibility is causing a bit of extra work. Would it make sense to heighten this requirement to:

In addition, such a document MUST include one of the following members, but not both:

while changing A document that supports this extension... above to A document using this extension...

I guess it comes down to a decision about whether we want to force servers to flexibly validate documents according to their declared extensions or whether we want clients to be able to be less precise about the extensions they're really expecting the server to process.

Contributor


For context, here's a snippet of my code:

// Extension URIs this server knows how to process.
$supported_extensions = [
  'https://jsonapi.org/extensions/#atomic'
];
// Reject the request if the client declared any extension outside that list.
$unsupported_extensions = array_diff($extensions, $supported_extensions);
if (!empty($unsupported_extensions)) {
  throw new UnsupportedMediaTypeException(sprintf('The %s JSON:API media type extension is not supported', current($unsupported_extensions)));
}
// When the atomic extension is declared, require the atomic:operations member.
if (in_array($supported_extensions[0], $extensions, TRUE)) {
  $request_document = Json::decode((string) $request->getContent());
  if (!isset($request_document['atomic:operations'])) {
    throw new BadRequestHttpException(sprintf('Request documents using the %s JSON:API media type extension must include an `atomic:operations` top-level member.', $supported_extensions[0]));
  }
}

It's that final throw new BadRequestHttpException() that I'm concerned about. According to the language above, I think it's technically valid for a client to send a Content-Type header with the atomic extension URI, even when it's not including an atomic:operations member.

Contributor


Actually, I guess it's not that bad. All it means is that I have to do:

$operations = $request_document['atomic:operations'] ?? [];

@gabesullice
Contributor

Just gave this another light scan. A few thoughts:

  1. the extension should say that its own URI is https://jsonapi.org/ext/atomic.
  2. it's unclear whether or not operations other than add, update, and remove are permitted.
  3. the way that the "combinations" of allowed ref members is described is somewhat confusing. I think this needs some reorg and lid shouldn't be in parentheses (see general note below)
  4. s/can not/cannot: https://github.com/json-api/json-api/pull/1437/files#diff-3258e77d8c1edf3edfa2dda4b957adceR122

I think the Operation objects section needs to be rewritten. It feels clunky right now, probably because it grew organically as we tweaked and retweaked it. I don't think the actual structure of operation objects needs to change, but I think that if we were to start that section with an empty canvas we could come up with a clearer way of describing the document structure.

@dgeb
Member Author

dgeb commented Sep 16, 2020

@gabesullice those are all good points. I've just pushed commits that attempt to address them, although I'm certainly open to discussing and refining them further.

I think the Operation objects section needs to be rewritten. It feels clunky right now, probably because it grew organically as we tweaked and retweaked it.

Yes, I'd be open to collaborating on this 👍

@gabesullice
Contributor

Thanks everyone for all the input on this! I think it's time to get this merged 🎉

@gabesullice gabesullice merged commit d9a964d into gh-pages Oct 1, 2020
@dgeb dgeb deleted the atomic-extension branch October 1, 2020 15:06

nicolestandifer3 added a commit to nicolestandifer3/DotNet-Core-Json-Api that referenced this pull request Aug 6, 2023
…h an invalid content type. This change additionally allows extensions proposed at json-api/json-api#1437. Added test + fixes for running an endpoint that is not exposed through JsonApiDotNetCore (and we should not interfere)