Events

This module contains event-processing mechanisms that are integrated with the standard Python logging module.

Example of usage:

from torch.distributed.elastic import events

event = events.Event(
    name="test_event", source=events.EventSource.WORKER, metadata={...}
)
events.get_logging_handler(destination="console").info(event)

API Methods

torch.distributed.elastic.events.record(event, destination='null')[source]
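No description accompanies the signature above; a minimal sketch of how it is typically called, assuming `record` publishes the given `Event` to the handler registered for the destination (the `"console"` destination mirrors the module-level example, and the default `"null"` destination discards events):

```python
from torch.distributed.elastic.events import Event, EventSource, record

# Build an event describing a meaningful action in the job
# ("checkpoint_saved" and its metadata keys are hypothetical names).
event = Event(
    name="checkpoint_saved",
    source=EventSource.WORKER,
    metadata={"rank": 0, "epoch": 3},
)

# With the default destination="null" the event is silently discarded;
# destination="console" (as in the example above) logs it to the console.
record(event)
record(event, destination="console")
```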
torch.distributed.elastic.events.construct_and_record_rdzv_event(run_id, message, node_state, name='', hostname='', pid=None, master_endpoint='', local_id=None, rank=None)[source]

Initialize a rendezvous event object and record its operations.

Parameters
  • run_id (str) – The run id of the rendezvous.

  • message (str) – The message describing the event.

  • node_state (NodeState) – The state of the node (INIT, RUNNING, SUCCEEDED, FAILED).

  • name (str) – Event name (e.g., the current action being performed).

  • hostname (str) – Hostname of the node.

  • pid (Optional[int]) – The process id of the node.

  • master_endpoint (str) – The master endpoint for the rendezvous store, if known.

  • local_id (Optional[int]) – The local_id of the node, if defined in dynamic_rendezvous.py.

  • rank (Optional[int]) – The rank of the node, if known.

Returns

None

Return type

None

Example

>>> # See DynamicRendezvousHandler class
>>> def _record(
...     self,
...     message: str,
...     node_state: NodeState = NodeState.RUNNING,
...     rank: Optional[int] = None,
... ) -> None:
...     construct_and_record_rdzv_event(
...         name=f"{self.__class__.__name__}.{get_method_name()}",
...         run_id=self._settings.run_id,
...         message=message,
...         node_state=node_state,
...         hostname=self._this_node.addr,
...         pid=self._this_node.pid,
...         local_id=self._this_node.local_id,
...         rank=rank,
...     )
torch.distributed.elastic.events.get_logging_handler(destination='null')[source]
Return type

Handler

Event Objects

class torch.distributed.elastic.events.api.Event(name, source, timestamp=0, metadata=<factory>)[source]

The class represents a generic event that occurs during torchelastic job execution.

The event can be any kind of meaningful action.

Parameters
  • name (str) – Event name.

  • source (EventSource) – The event producer, e.g. agent or worker.

  • timestamp (int) – Timestamp in milliseconds when the event occurred.

  • metadata (dict[str, Union[str, int, float, bool, NoneType]]) – Additional data associated with the event.
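A short sketch constructing an Event from the parameters above; the serialize()/deserialize() round trip shown here is assumed from the torchelastic events API and may differ across versions:

```python
from torch.distributed.elastic.events import Event, EventSource

# Construct an event with a few metadata values of the allowed types.
event = Event(
    name="train_step",
    source=EventSource.WORKER,
    metadata={"rank": 0, "loss": 0.25, "done": False},
)

# Events serialize to a JSON string, so they round-trip through any
# text-based logging sink (assumption: serialize/deserialize API).
payload = event.serialize()
restored = Event.deserialize(payload)
```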

class torch.distributed.elastic.events.api.EventSource(value)[source]

Known identifiers of the event producers.
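The enum members are not listed on this page; in current torchelastic sources they are AGENT and WORKER (an assumption worth verifying against your installed version):

```python
from torch.distributed.elastic.events import EventSource

# The producer of an event: the elastic agent process or a worker
# process (member set may grow in future releases).
agent = EventSource.AGENT
worker = EventSource.WORKER
```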

torch.distributed.elastic.events.api.EventMetadataValue

alias of Optional[Union[str, int, float, bool]]
