Implemented some experimental event loops with nextTick() support. #234

Merged
merged 24 commits into from
Dec 5, 2013
jmalloc
Contributor

@jmalloc jmalloc commented Nov 13, 2013

I fully expect that this PR will be rejected in its current form, but I needed a place to start a discussion about it.

This pull-request attempts to provide a reliable way to register callbacks on the event-loop that are executed with the following guarantees:

On each tick of the engine:

  • the callbacks are executed in the same order as they are enqueued
  • the callbacks are executed before any pending timer or IO events
  • callbacks registered within an existing callback will also execute before IO events
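
The three guarantees listed above can be illustrated with a minimal sketch. `TinyLoop`, `addEvent()` and the log labels are hypothetical names invented for this illustration and are not the code from this PR; pending timer and IO events are reduced to a plain array of callbacks.

```php
<?php
// Hypothetical illustration only; not the implementation from this PR.
class TinyLoop
{
    private $nextTickQueue;       // FIFO queue of next-tick callbacks
    private $pendingEvents = [];  // stand-in for ready timer / IO callbacks

    public function __construct()
    {
        $this->nextTickQueue = new SplQueue();
    }

    public function nextTick(callable $callback)
    {
        $this->nextTickQueue->enqueue($callback);
    }

    public function addEvent(callable $callback)
    {
        $this->pendingEvents[] = $callback;
    }

    public function tick()
    {
        // Drain the queue in FIFO order. Callbacks enqueued while draining
        // are picked up by the same loop, so they also run before any event.
        while (!$this->nextTickQueue->isEmpty()) {
            $callback = $this->nextTickQueue->dequeue();
            $callback($this);
        }

        // Only now run the stand-in timer / IO events.
        $events = $this->pendingEvents;
        $this->pendingEvents = [];
        foreach ($events as $event) {
            $event($this);
        }
    }
}

$log = [];
$loop = new TinyLoop();
$loop->addEvent(function () use (&$log) { $log[] = 'io'; });
$loop->nextTick(function (TinyLoop $loop) use (&$log) {
    $log[] = 'first';
    $loop->nextTick(function () use (&$log) { $log[] = 'nested'; });
});
$loop->nextTick(function () use (&$log) { $log[] = 'second'; });
$loop->tick();
// $log is now ['first', 'second', 'nested', 'io']
```

Note that the event was registered before any next-tick callback, yet it still runs last.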

As best I can tell this matches the current behaviour of process.nextTick() in nodejs (0.10.x), and as such I have dared to call the feature 'nextTick'. Please correct any misconceptions I have about this :) (see #92)

There is one new interface:

  • NextTickLoopInterface - adds the nextTick() method to the existing LoopInterface

And two implementations:

  • ExtEventLoop - based on the ext-event PHP extension
  • StreamSelectNextTickLoop - based on stream_select

Both implementations are tested against the existing abstract test cases for loops and timers.

I haven't had a chance to try libev or libuv yet, but I will attempt to provide implementations based on both of those libraries if this PR is well received.

I chose not to completely replace the existing stream_select() based loop at this stage, but there is no reason it would need to remain if the implementation proposed in this PR is accepted in the future.

Finally, I also added some doc-blocks to LoopInterface to clarify the guarantees made in regards to timer behaviour (in contrast to nextTick() behaviour) and return values.

Edit:
Notably absent is any concept of minimum resolution for timers. I was speaking with @nrk about this on IRC and we talked about the possibility of allowing users to configure the resolution; perhaps that is now unnecessary.

@nrk
Member

nrk commented Nov 13, 2013

Thanks for the effort @jmalloc! I wanted to give nextTick() a try myself, but couldn't find the time to start experimenting in recent months so this PR is much appreciated.

Before going into a detailed review of the code, which will take time to do properly, I have a couple of questions aimed at better understanding the reasons for some choices:

  1. Why not reuse React\EventLoop\Timer\Timers in the stream_select()-based loop to keep the logic for timers separate from the loop implementation? You can just drop the minimum resolution check.
  2. On a related note, wouldn't it be better to extract the logic needed to support nextTick() and avoid an abstract loop class altogether?
  3. I'm not a fan of having a specialized interface just to expose a nextTick() method, not at this point at least. In the past, admittedly at very early stages, we left methods related to timers out of the event loop interface until we had them implemented for all our loops so I'd just do the same in this case too. Can we just leave the check at runtime using method_exists() and delay the decision for having an additional interface until we have better indications on the actual feasibility of having nextTick() supported everywhere? @igorw?
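
The runtime check suggested in the third point could look roughly like the sketch below. Both loop classes and the `defer()` helper are hypothetical stand-ins written for this illustration; only `method_exists()` is a real PHP function. The zero-ish timer fallback is an assumption and does not provide the same ordering guarantees as nextTick().

```php
<?php
// Hypothetical stand-in for a loop that supports nextTick().
class LoopWithNextTick
{
    public $queue = [];
    public $timers = [];

    public function nextTick(callable $callback)
    {
        $this->queue[] = $callback;
    }

    public function addTimer($interval, callable $callback)
    {
        $this->timers[] = [$interval, $callback];
    }
}

// Hypothetical stand-in for a loop that only supports timers.
class LoopWithoutNextTick
{
    public $timers = [];

    public function addTimer($interval, callable $callback)
    {
        $this->timers[] = [$interval, $callback];
    }
}

// Defer a callback, preferring nextTick() when the loop provides it.
// Returns the name of the mechanism that was used.
function defer($loop, callable $callback)
{
    if (method_exists($loop, 'nextTick')) {
        $loop->nextTick($callback);
        return 'nextTick';
    }

    // Assumed fallback: a very short timer. This runs "soon", but without
    // the "before any timer or IO event" guarantee.
    $loop->addTimer(0.001, $callback);
    return 'timer';
}

$usedA = defer(new LoopWithNextTick(), function () {});
$usedB = defer(new LoopWithoutNextTick(), function () {});
// $usedA is 'nextTick', $usedB is 'timer'
```

The downside, as the discussion points out, is that calling code then behaves differently depending on which loop it happens to be given.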

Strictly related to my third point: in all honesty, I'm still of the opinion that we should aim to have nextTick() supported for all of our loops or simply live without it. No gray area, no mixed support. Maybe my position is a bit extreme, but it's for the sake of consistency.

@jmalloc
Contributor Author

jmalloc commented Nov 13, 2013

Thanks, and you're welcome :) I had found myself adding undesirable hacks to my own projects to try to emulate nextTick() so I thought it was about time I tried to do it properly.

  1. There's probably no reason not to do as you suggest. I had assumed that Timers was more of an implementation detail of the current StreamSelectLoop, as to my knowledge all the other loop implementations can/should/must use their own timer systems. I'll give this a go and see if there's any problem.
  2. The logic is - by definition - quite intertwined with the 'tick' logic of the loop. The intent was to let the abstract class handle the finer points of this and let the concrete implementation get on with the specifics of the timers and streams. I'm certainly not opposed to extracting the queue processing logic, perhaps a trait would suit better?
  3. This depends on how important nextTick() is to everyone. I think it's quite necessary, so if there are loop implementations that cannot support nextTick() but are absolutely worth retaining for other reasons then there should probably be an interface. Ultimately this is not for me to decide, but I would like to mention that I had also hoped to tackle registering event handlers for process signals, which is another feature that might only be supported by some of the loop implementations (but kind of needs to be supported by the loop, since that's what's doing the blocking - see Keyboard interrupt signal problem #221).
  4. You also mentioned on IRC that at a glance the code was too heavy on the method dispatching which might have some performance implications. I hadn't really considered this so I'm happy to reduce the amount of method abstraction if that's the approach that is preferred.

@nrk
Member

nrk commented Nov 13, 2013

  1. Having a separate Timers class makes the logic driving timers self-contained, easier to understand and easily testable. In addition to that, while it's true that it's used only in StreamSelectLoop, I wouldn't call it an implementation detail in a strict sense because it's not tied to the inner workings of the loop and could eventually be reused by third-party libraries (sure, it's more of a nice side effect).

  2. I'm not really against having an abstract class; I was rather wondering whether there's enough useful code left to reuse after trimming it a bit (partly related to the fourth point in your reply). Just to name a few cases:

    • If we add an external NextTickManager class we can move flushNextTickQueue() there, using a different name. Ideally NextTickManager would hide some implementation details that have nothing to do with the event loop, we could even think about adding support for NodeJS's maxTickDepth nicely wrapping all the needed logic inside that class.
    • addTimer() and addPeriodicTimer() don't provide any actual benefit if we consider that each loop class must provide its own backend-dependent implementation of scheduleTimer() anyway.
    • The streamKey() method doesn't wrap enough code to be considered useful and simply adds overhead to a basic integer casting operation that is very quick in userland code.

    If we still feel the need for an abstract class even after all the trimming, then let's just have it :-)

  3. I think it is important, to the point that I'd rather drop support for an event loop backend if it would mean having it implemented in all the remaining ones. On the other hand, having nextTick() implemented only for a couple of event loops would be kind of confusing and in this respect the very existence of React\EventLoop\Factory further complicates things. My point is: an additional interface just makes the fact that we lack consistency more explicit, so let's focus more on the practical implications ;-) By the way, I'd leave the matter of signals handling for another time but the reasoning is similar.

  4. While it is generally true that early optimizations are the root of all evil, in our case we should strive to optimize as much as possible because we are doing many things in userland (especially in the case of StreamSelectLoop) and things can get quite busy inside an event loop. Method dispatching is relatively expensive, so reducing method invocations to a reasonable minimum is desirable.

@jmalloc
Contributor Author

jmalloc commented Nov 13, 2013

Everything you mention here sounds perfectly reasonable, I'll update the implementation and we can move from there.

I won't be attempting signal handling in this branch/PR, but I do need it for my own purposes so I figured I'd start another discussion/PR for that later unless it's something that's outright unwanted in React (which would be a shame since it'd mean I'd have to have custom loop implementations again :P)

@jmalloc
Contributor Author

jmalloc commented Nov 13, 2013

1, 2, 3 + 4: I've made a quick attempt at improving some of the issues you've raised. At least for the moment I've retained both the abstract class (though smaller) and the interface. I'll do my best to get to libev/libuv implementations sooner rather than later so that we can get a better idea of how feasible it is to just add nextTick() to LoopInterface.

5: In regards to maxTickDepth, I deliberately omitted that from my original implementation for lack of understanding. I'm not sure, but it seems like it entirely breaks the guarantees of nextTick() at what would seem like 'random' points to the user, though I can see how you might end up with CPU-bound code that appears to be cooperative by deferring using nextTick() but is in fact blocking other events completely. NodeJS also offers setImmediate() as a way to defer with no delay and without starving IO (btw, I think this name is terrible).
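
The trade-off described in point 5 can be sketched as follows. This is a simplified, hypothetical cap on the number of callbacks run per flush, not NodeJS's actual maxTickDepth semantics; `flushQueue()` and its `$maxCallbacks` parameter are names made up for this illustration.

```php
<?php
// Run queued callbacks in FIFO order, but stop after $maxCallbacks of them.
// Returns how many callbacks were executed.
function flushQueue(SplQueue $queue, $maxCallbacks)
{
    $ran = 0;
    while (!$queue->isEmpty() && $ran < $maxCallbacks) {
        $callback = $queue->dequeue();
        $callback();
        $ran++;
    }
    return $ran;
}

$queue = new SplQueue();
$calls = 0;

// A callback that always re-queues itself: it looks cooperative, but with an
// uncapped flush the queue never empties and IO would never get a turn.
$reschedule = function () use (&$reschedule, &$calls, $queue) {
    $calls++;
    $queue->enqueue($reschedule);
};
$queue->enqueue($reschedule);

$ran = flushQueue($queue, 1000);
// $ran is 1000 and one callback is still waiting in the queue
```

The cap stops the starvation, but the callback left in the queue now runs after IO, which is exactly the broken guarantee the comment worries about.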

Edit: clarity.

*/
protected function isEmpty()
{
return $this->timers->isEmpty()
Member

Checking if $this->readStreams and $this->writeStreams have no elements is a cheaper operation than doing the same for $this->timers, we should really move $this->timers->isEmpty() to the rightmost side of the short-circuit evaluation so that it gets executed only if the two arrays are empty:

return !$this->readStreams && !$this->writeStreams && $this->timers->isEmpty();
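
The effect of that ordering can be checked with a small stand-in. `CountingTimers` is a hypothetical class that only counts how often isEmpty() is called; the stream arrays are plain placeholders.

```php
<?php
// Hypothetical stand-in for the timer collection; counts isEmpty() calls.
class CountingTimers
{
    public $calls = 0;

    public function isEmpty()
    {
        $this->calls++;
        return true;
    }
}

$timers = new CountingTimers();
$readStreams = [1 => 'stream'];  // one registered read stream (placeholder)
$writeStreams = [];

// Cheap array checks first; the method call is skipped as soon as one of
// the arrays is non-empty.
$idle = !$readStreams && !$writeStreams && $timers->isEmpty();
// $idle is false and $timers->calls is still 0
```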

Contributor Author

Noted. In the most recent push I have endeavoured to ensure that the least costly checks are always performed first.

@nrk
Member

nrk commented Nov 14, 2013

5: In regards to maxTickDepth, I deliberately omitted that from my original implementation for lack of understanding.

Sure, it makes perfect sense to omit it right now; we should focus on getting nextTick() to work first.

I'm not sure, but it seems like it entirely breaks the guarantees of nextTick() at what would seem like 'random' points to the user, though I can see how you might end up with CPU-bound code that appears to be cooperative by deferring using nextTick() but is in-fact blocking other events completely.

The introduction of maxTickDepth stirred quite a fuss in NodeJS for the same exact reason when it was proposed, but it's understandable how they needed it at a certain point. Unfortunately the very name of nextTick() is what instills a certain guarantee of when your callbacks will be executed, but maxTickDepth can indeed betray your expectations. nextishTick()? 😄

NodeJS also offers setImmediate() as a way to defer with no delay and without starving IO, (btw, I think this name is terrible).

I do agree that setImmediate() is a terrible name, but I honestly can't think of any better alternative. Because yes, you know, it's not that we must necessarily follow the same naming used in NodeJS. But anyway, we'll eventually tackle setImmediate() after merging this PR.

@jmalloc
Contributor Author

jmalloc commented Nov 14, 2013

Thanks for the extended feedback @nrk, I hope to get a chance to make some more progress on this today :)

@jmalloc
Contributor Author

jmalloc commented Nov 15, 2013

I've just pushed the next round of changes, there's a bit to go over but I've got to leave the office at the moment. I'll provide a summary of the changes as soon as I get a chance.

@jmalloc
Contributor Author

jmalloc commented Nov 15, 2013

  • Removed NextTickLoopInterface and AbstractNextTickLoop
  • Replaced existing StreamSelectLoop with the one from this PR
  • Implemented changes based on comments from @nrk

Removing the abstract base class had the intended effect of allowing for some optimisations. I've tried to be mindful of performing least-cost operations first, unrolling unnecessary loops, keeping dispatch to a minimum etc, though I'm sure there is further room for improvement.

I have a question about the way the closure-destroys-self problem is handled in the existing LibEventLoop implementation. The callback for streams is kept in the loop object itself, whereas the timers callback keeps a reference to itself. Is there any reason for the difference, is there a preferred solution? In this PR I've used the first solution for both streams and timers.

}
}

return true;
Member

Spurious return true in waitForStreamActivity? It doesn't seem to be used anywhere, and I don't see how it could be useful anyway.

Contributor Author

Oops! a leftover appendage from some intermediate refactoring. Gone now.

@cboden
Member

cboden commented Nov 24, 2013

The first test I do is the Autobahn WebSocket Fuzzing test. I'll see if I can make one with a PHP client + server with minimal dependencies.

@nrk
Member

nrk commented Nov 25, 2013

I'm not able to reproduce any weird issue using wrk against a basic HTTP server emitting a 65k payload on each response.

EDIT: PHP 5.5.3-1ubuntu2 (cli) (built: Oct 9 2013 14:49:24)

This is the output of wrk using LibEventLoop and a 128kb payload on each request:

Running 2m test @ http://127.0.0.1:1337/
  24 threads and 200 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     2.52s     7.32s    0.91m    89.47%
    Req/Sec    92.38     79.93   527.00     65.60%
  235747 requests in 2.00m, 28.81GB read
  Socket errors: connect 0, read 235933, write 0, timeout 8476
Requests/sec:   1964.19
Transfer/sec:    245.80MB

@nrk
Member

nrk commented Nov 28, 2013

One last thing I'd discuss before thinking about merging this PR, more like a conceptual issue: we have a mismatch with the visibility of member variables and methods in our loop classes, in this case there's no point in having protected methods when all of our member variables are private.

The only valid reason for having private members would be to prevent developers from accessing certain core parts of the base classes when extended, so we should either decide if we want to seal our classes for good by marking them as final or open them up by making their member variables protected.

When I ask myself "what's the point of extending one of the loop classes?" I can't really find a convincing answer so I'd just mark them final. We have an interface, it's better for developers to write their own loop class from scratch.

Thoughts?

@jmalloc
Contributor Author

jmalloc commented Nov 28, 2013

I agree that there isn't any apparent reason to extend these implementations for production code. I think all of the methods and properties that are currently protected should be private.

On the flip-side, PHPUnit, Phake, etc extend from classes to produce mocks, and for that reason I tend to avoid the use of final for any class that may need to be mocked. This is only a problem if someone (perhaps outside the React project) uses one of the concrete classes as a type-hint, so it's pretty esoteric I admit, but it's a problem I've come across before.

@cboden
Member

cboden commented Nov 28, 2013

As per our discussion in IRC, my above issue was caused by a mismanaged merge conflict by yours truly.

I do want to discuss another topic before this is merged and that is naming convention. nextTick (I'm assuming) was picked after NodeJS' nextTick. This adds an event to the start of the event loop. The next logical feature would be to add an event to the end of the event loop. In Node this is called setImmediate. I believe these naming conventions came from trying to be somewhat consistent with setTimeout and setInterval from JavaScript's early days.

Python 3.4 introduced a new async API and their equivalent methods are call_soon and call_later.

I find those method names to be far more intuitive. I'm not settled on them, but I'd like to open up a discussion to set a name on these methods for React. Looking at the API I wouldn't imagine setImmediate would run after nextTick, timers that are ready to be executed, and streams that are ready for reading. Apparently "immediately" means after everything else in the queue is done.

@clue
Member

clue commented Nov 28, 2013

what's the point of extending one of the loop classes

For example extending StreamSelectLoop in order to use socket_select() instead of stream_select() (https://github.com/clue/socket-react). Admittedly this should not be part of react's core (check #191 if you dare), but what's the point in locking out future extensions just for the sake of it? :)

Encouraging a SOLID design and discouraging sub-classing is a good thing IMHO, but intentionally inhibiting sounds like a bad idea to me.

+1 for keeping this as-is. The PR looks really good to me, can't wait to give it a try 👍

@jmalloc
Contributor Author

jmalloc commented Nov 28, 2013

I do want to discuss another topic before this is merged and that is naming convention. nextTick (I'm assuming) was picked after NodeJS' nextTick. This adds an event to the start of the event loop. The next logical feature would be to add an event to the end of the event loop. In Node this is called setImmediate. I believe these naming conventions came from trying to be somewhat consistent with setTimeout and setInterval from JavaScript's early days.

IIRC setImmediate also differs in that it can not starve I/O. One callback from the setImmediate queue is invoked per-tick, whereas callbacks queued with nextTick are appended to the queue that is currently being processed. There's definitely value in coming up with names that reflect these differences. nextTick is accurate IMO, but I'm not attached to it if we can come up with something better. I've already stated that I think setImmediate is just terrible.

Edit: on reflection, nextTick actually means 'this tick' when called within another next-tick callback, so it's not strictly accurate.
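
The difference described in this comment can be sketched with two queues. This follows the behaviour as recalled above (the whole nextTick queue is drained, while one setImmediate callback runs per tick); it is not a statement about NodeJS internals. `TwoQueueLoop` is a hypothetical class, the 'io' entry stands in for stream and timer processing, and running the immediate callback before that step is an arbitrary choice made for this illustration.

```php
<?php
// Hypothetical illustration of the two deferral styles discussed above.
class TwoQueueLoop
{
    public $log = [];
    private $tickQueue;
    private $immediateQueue;

    public function __construct()
    {
        $this->tickQueue = new SplQueue();
        $this->immediateQueue = new SplQueue();
    }

    public function nextTick(callable $callback)
    {
        $this->tickQueue->enqueue($callback);
    }

    public function setImmediate(callable $callback)
    {
        $this->immediateQueue->enqueue($callback);
    }

    public function tick()
    {
        // nextTick: drain everything, including callbacks added meanwhile.
        while (!$this->tickQueue->isEmpty()) {
            $callback = $this->tickQueue->dequeue();
            $callback($this);
        }

        // setImmediate: at most one callback per tick, so IO is never starved.
        if (!$this->immediateQueue->isEmpty()) {
            $callback = $this->immediateQueue->dequeue();
            $callback($this);
        }

        $this->log[] = 'io';  // stand-in for stream / timer processing
    }
}

$loop = new TwoQueueLoop();
$loop->nextTick(function ($loop) { $loop->log[] = 'tick 1'; });
$loop->nextTick(function ($loop) { $loop->log[] = 'tick 2'; });
$loop->setImmediate(function ($loop) { $loop->log[] = 'immediate 1'; });
$loop->setImmediate(function ($loop) { $loop->log[] = 'immediate 2'; });
$loop->tick();
$loop->tick();
// $loop->log: tick 1, tick 2, immediate 1, io, immediate 2, io
```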

@nrk
Member

nrk commented Nov 29, 2013

@clue

For example extending StreamSelectLoop in order to use socket_select() instead of stream_select() (https://github.com/clue/socket-react). Admittedly this should not be part of react's core (check #191 if you dare), but what's the point in locking out future extensions just for the sake of it? :)

Fair enough, your example of StreamSelectLoop is practical and I wasn't considering such a scenario.

Encouraging a SOLID design and discouraging sub-classing is a good thing IMHO, but intentionally inhibiting sounds like a bad idea to me.

OK let's drop the idea of marking loop classes final, we still have a strong visibility mismatch between member methods and variables. One example for all: ExtEventLoop::createStreamCallback() is a protected method but basically can't be overridden because $this->readListeners and $this->writeListeners are private thus not accessible. Most of the protected methods in our loop classes fall into this category, but as a counter-example StreamSelectLoop::streamSelect() doesn't depend on any member variable so leaving it protected is OK. Sounds reasonable?
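
The mismatch can be reproduced in isolation. `BaseLoop`, `ChildLoop` and their members are hypothetical names for this sketch; the point is only that an overriding method in a subclass cannot see a property the parent declared private.

```php
<?php
// Hypothetical base class: protected method, private state.
class BaseLoop
{
    private $listeners = ['a' => 1, 'b' => 2];

    protected function countListeners()
    {
        return count($this->listeners);
    }

    public function describe()
    {
        return $this->countListeners();
    }
}

// The override is allowed because the method is protected, but the parent's
// private $listeners is invisible from this scope.
class ChildLoop extends BaseLoop
{
    protected function countListeners()
    {
        return isset($this->listeners) ? count($this->listeners) : -1;
    }
}

$base = (new BaseLoop())->describe();
$child = (new ChildLoop())->describe();
// $base is 2, $child is -1
```

Making the property protected, or the method private, removes the inconsistency in either direction.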

@jmalloc

On the flip-side, PHPUnit, Phake, etc extend from classes to produce mocks, and for that reason I tend to avoid the use of final for any class that may need to be mocked. This is only a problem if someone (perhaps outside the React project) uses one of the concrete classes as a type-hint, so it's pretty esoteric I admit, but it's a problem I've come across before.

Testing shouldn't really be the driving force behind these kinds of decisions. Furthermore, in this specific case, using a concrete loop class instead of the interface for type-hinting would be blatantly weird rather than esoteric. Well anyway, let's keep classes as they are without final.

@nrk
Member

nrk commented Nov 29, 2013

About naming...

@cboden

Python 3.4 introduced a new async API and their equivalent methods are call_soon and call_later.

As I've said before, probably on IRC, we definitely can (and probably should) come up with our own method names.

I find those method names to be far more intuitive. I'm not settled on them, but I'd like to open up a discussion to set a name on these methods for React. Looking at the API I wouldn't imagine setImmediate would run after nextTick, timers that are ready to be executed, and streams that are ready for reading. Apparently "immediately" means after everything else in the queue is done.

We actually all agree on setImmediate being a terrible name. nextTick is better in that respect despite not being entirely accurate. While I'm not crazy about the names call_soon and call_later, they have a more abstract feeling to them as they don't try so hard to describe at which point of the loop execution the associated callbacks will be executed (and I'm starting to think it's pretty much impossible to achieve that in a short method name).

Having said that, I still can't think of any decent alternative.

@cboden
Member

cboden commented Dec 1, 2013

I like the Python names better than the Node names and would like to see them if we can't come up with a better name ourselves. The floor is open to all suggestions.

@jmalloc sudo pecl install event

pecl/event conflicts with package "pecl/libevent" (version >= 0.0.2), installed version is 0.1.0

Do you know any way around this? For testing purposes it would be nice to be able to have both installed.

@jmalloc
Contributor Author

jmalloc commented Dec 1, 2013

@cboden

pecl/event conflicts with package "pecl/libevent" (version >= 0.0.2), installed version is 0.1.0

I haven't tried, sorry - I've been testing pecl/event on OSX and pecl/libevent on the Vagrant box.

Python 3.4 introduced a new async API and their equivalent methods are call_soon and call_later.

Unless I missed it, this PEP doesn't appear to define an equivalent to setImmediate, rather call_later is equivalent to setTimeout. It's probably a bad idea to use those names but repurpose them.

@nrk
Member

nrk commented Dec 2, 2013

@cboden & @jmalloc

pecl/event conflicts with package "pecl/libevent" (version >= 0.0.2), installed version is 0.1.0

Skip pecl and build from sources. Inconvenient, but it works.

Python 3.4 introduced a new async API and their equivalent methods are call_soon and call_later.

Unless I missed it, this PEP doesn't appear to define an equivalent to setImmediate, rather call_later is equivalent to setTimeout. It's probably a bad idea to use those names but repurpose them.

Finally took a look at the above-mentioned PEP and, as @jmalloc says, there doesn't seem to be an equivalent to setImmediate(). Anyway, setImmediate() is not a priority for us right now; we should discuss it in a different PR as we still need to decide if we really want it or not.

If everyone agrees, I'd take a few more days of testing so that we can think about the name nextTick in the meantime.

@nrk
Member

nrk commented Dec 4, 2013

@jmalloc: as proposed by @igorw on IRC, it would be useful to have this pull request split in half:

  • one with only the changes needed to add nextTick to the already existing loops (could be this very same PR)
  • and a new one dedicated solely to adding ExtEventLoop

By doing so we could merge nextTick as soon as possible (and by soon, I mean real soon ;-)) and have more time to test ExtEventLoop (as requested especially by @cboden) since it's a whole new extension to us.

Would it be OK for you?

@jmalloc
Contributor Author

jmalloc commented Dec 4, 2013

No problem, I'll do this today.

@jmalloc
Contributor Author

jmalloc commented Dec 4, 2013

I'll wait until this is merged before submitting a new PR for ExtEventLoop since it will include nextTick.

@nrk nrk merged commit edc337e into reactphp:master Dec 5, 2013
@nrk
Member

nrk commented Dec 5, 2013

Guess what? :-)

@igorw
Contributor

igorw commented Dec 5, 2013

Thanks @jmalloc!
