Incorrect nanosecond timestamps being written to influxdb

Writing points with time precision `ns` results in incorrect nanosecond timestamps being written to InfluxDB. The effect can be seen when writing points with both `DataFrameClient.write_points` and `InfluxDBClient.write`.

This has been noted in issues #527, #489, #346, #344 and #340, and a pull request has been proposed: [https://github.com/influxdata/influxdb-python/pull/346](https://github.com/influxdata/influxdb-python/pull/346). I believe this pull request does not go far enough.

Issue #489 describes the same "floor divide" fix that forms part of the solution proposed here.

The solution I am proposing in a pull request uses floor division together with `pandas.Timestamp`, which supports nanosecond resolution.

### Implication
When the library is used to update an existing record, it instead writes a new record with the erroneous timestamp, because the timestamp no longer matches that of the original point.

### Affected code regions
`_dataframe_client.py`
- `_convert_dataframe_to_json`
- `_convert_dataframe_to_lines`
- `_datetime_to_epoch`

`line_protocol.py`
- `_convert_timestamp`

### Explanation:
Dividing a nanosecond timepoint (`np.int64` or `int`) by 1 (ns precision) with the true-division operator `/` produces errors. It is necessary to use the floor-division operator `//`, which yields the correct `np.int64` or `int` value.

If an `np.int64` is divided by an `int` with `/`, the result is a `float64`. A `float64` has a 53-bit significand, so when the initial value is larger than 2**53 the conversion back to `np.int64` loses nanosecond precision. The same error occurs with standard Python `int` types.
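The threshold can be checked with plain Python integers, without pandas or NumPy. The sketch below reuses the two nanosecond values from this report (the variable names are only illustrative) and shows true division rounding the large value while floor division preserves it.

```python
# float64 has a 53-bit significand: every integer up to 2**53 is exact,
# larger integers are rounded to the nearest representable value.
EXACT_LIMIT = 2 ** 53  # 9007199254740992

small_ns = 3600000000000          # EPOCH + 1 hour, below the limit
large_ns = 1538550396975643599    # 2018-10-03 07:06:36.975643599 UTC

# True division returns a float, so the large value is rounded.
small_roundtrip = int(small_ns / 1)
large_roundtrip = int(large_ns / 1)

# Floor division stays in integer arithmetic and is exact.
large_floor = large_ns // 1

print(small_ns < EXACT_LIMIT, small_roundtrip == small_ns)   # True True
print(large_ns > EXACT_LIMIT, large_roundtrip - large_ns)    # True 49
print(large_floor == large_ns)                               # True
```

The small value survives the round trip only because it lies below 2**53, which is why a test timepoint close to the epoch cannot reveal the problem.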

### Example (formulated in an IPython notebook)


    # dataframe is the result of a query (an InfluxDB measurement with particular field value errors) that returned 2 timepoints
    dataframe.index  --->   DatetimeIndex(['2018-05-29 00:53:12.889962156+00:00', '2018-10-03 07:06:36.975643599+00:00'], dtype='datetime64[ns, UTC]', freq=None)
    dataframe.index[1]  ---> Timestamp('2018-10-03 07:06:36.975643599+0000', tz='UTC')
    dataframe.index[1].value  ---> 1538550396975643599
    
    # alternative way to construct the timepoint
    timepoint = pd.Timestamp('2018-10-03 07:06:36.975643599+0000', tz='UTC')
    timepoint.value   ---> 1538550396975643599

    type(timepoint.value)   ---> int
    timepoint.value / 1  ----> 1.5385503969756436e+18
    type(timepoint.value / 1) ----> float
    type(timepoint.value // 1) ----> int
    np.int64(timepoint.value // 1)   ---> 1538550396975643599
    np.int64(timepoint.value / 1)  ---> 1538550396975643648

The error (in ns) is:

    np.int64(timepoint.value // 1) - np.int64(timepoint.value / 1)  ---> -49


#### Check using the unit test timepoint

    EPOCH = pd.Timestamp('1970-01-01 00:00+00:00')
    nowplus1h = EPOCH + pd.Timedelta('1 hour')
    nowplus1h.value ---> 3600000000000
    nowplus1h.value / 1 ---> 3600000000000.0
    np.int64(nowplus1h.value / 1) ---> 3600000000000

No error in this case: the value is small enough to be represented exactly as a float.

#### Suggested unit test timepoint

    futuretimepoint = EPOCH + pd.Timedelta('20000 day  +23:10:55.123456789')
    futuretimepoint.value ---> 1728083455123456789
    futuretimepoint.value / 1 ---> 1.7280834551234568e+18
    np.int64(futuretimepoint.value / 1) ---> 1728083455123456768

The error (in ns) is:

    futuretimepoint.value - np.int64(futuretimepoint.value / 1) ---> 21

  

All locations where timepoints are calculated need modification to yield the expected result at nanosecond precision.

The unit tests do not expose the problem because the 2 test timepoints, EPOCH (`'1970-01-01 00:00+00:00'`) and EPOCH + 1 hour, are small enough not to show the loss of precision. These 2 test points also have only microsecond resolution, with no nanosecond component.

I propose that test case 2 be changed to something like EPOCH + 20000 days and 23h 10m 55.123456789s. It is also necessary to change all calculations based on `datetime` (which only has microsecond resolution) to pandas `Timedelta` and `Timestamp` (which have nanosecond resolution).
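A minimal sketch of the floor-divide approach, using integer arithmetic only. The helper `ns_to_precision` and the table `NS_PER_UNIT` are hypothetical names introduced for illustration and are not part of influxdb-python; the precision codes (`n`, `u`, `ms`, `s`, `m`, `h`) are assumed to be the ones the write API accepts, and the actual change to `_convert_timestamp` may look different. The proposed test timepoint is rebuilt from its components so that no float ever enters the calculation.

```python
# Hypothetical helper, not part of influxdb-python: converts an integer
# nanosecond epoch value to another precision using floor division only.
NS_PER_UNIT = {
    "n": 1,
    "u": 10 ** 3,
    "ms": 10 ** 6,
    "s": 10 ** 9,
    "m": 60 * 10 ** 9,
    "h": 3600 * 10 ** 9,
}


def ns_to_precision(ns, precision="n"):
    """Return the epoch value in the requested precision as an exact int."""
    return int(ns) // NS_PER_UNIT[precision]


# Proposed test timepoint: EPOCH + 20000 days 23:10:55.123456789
future_ns = (20000 * 86400 + 23 * 3600 + 10 * 60 + 55) * 10 ** 9 + 123456789

print(ns_to_precision(future_ns, "n"))   # 1728083455123456789
print(ns_to_precision(future_ns, "u"))   # 1728083455123456
print(ns_to_precision(future_ns, "s"))   # 1728083455
print(int(future_ns / 1) - future_ns)    # -21, the error from true division
```

A unit test built on this timepoint fails as soon as any step of the conversion passes through a float, which the current EPOCH and EPOCH + 1 hour timepoints cannot detect.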

I am preparing a pull request that attempts to address all timestamp calculations and fixes the unit tests.


Product versions where the issue was discovered and where fixes are being tested:

    pd.show_versions()
    INSTALLED VERSIONS
    ------------------
    commit: None
    python: 3.6.2.final.0
    python-bits: 64
    OS: Windows
    OS-release: 10
    machine: AMD64
    processor: Intel64 Family 6 Model 70 Stepping 1, GenuineIntel
    byteorder: little
    LC_ALL: None
    LANG: None
    LOCALE: None.None

    pandas: 0.20.3
    pytest: None
    pip: 9.0.1
    setuptools: 36.4.0
    Cython: None
    numpy: 1.13.1
    scipy: 0.19.1
    xarray: None
    IPython: 6.2.1
    sphinx: None
    patsy: None
    dateutil: 2.6.1
    pytz: 2017.2
    ............................................
    ...........
    ..........................

np.version.full_version




Incorrect nanosecond timestamps being written to influxdb #649
