Issue
There is a sample csv data like (real data is in millisecond percision)
using TimeSeries, Plots
s="DateTime,Open,High,Low,Close,Volume
2020/01/05 16:14:01,20,23,19,20,30
2020/01/05 16:14:11,23,27,19,22,20
2020/01/05 17:14:01,24,28,19,23,10
2020/01/05 18:14:01,25,29,20,24,40
2020/01/06 08:02:01,26,30,22,25,50"
ta=readtimearray(IOBuffer(s),format="yyyy/mm/dd HH:MM:SS")
plot(ta.Volume)
I found the package TimeSeries
and Temporal
are based on daily plot. Is there any easy way to aggregate them into minutes/hourly/daily/weekly... and plot them?
- For the Open value, it should keep the first value during the period.
- For the High value, it should be the maximum value during the period.
- For the Low value, it should be the minimum value during the period.
- For the Close value, it should be the last value during the period.
- For the Volume value, it should be the sum value during the period.
I expect it could the volume like tb
s="DateTime,Volume
2020/01/05 16:00:00,50
2020/01/05 17:00:00,10
2020/01/05 18:00:00,40
2020/01/06 08:00:00,50"
tb=readtimearray(IOBuffer(s),format="yyyy/mm/dd HH:MM:SS")
plot(tb.Volume)
Solution
Method 1: I found a workable but not perfect method. For example, plot in hourly, by Volume
using DataFrames,Statistics,Dates
df = DataFrame(ta)
df.ms = Date.value.(df.timestamp)
df.hour = df.ms
df.hour = df.ms .÷ (60*60*1000)
df2 = aggregate(df[:, [:hour, :Volume]], :hour, sum)
df2.timestamp = convert.(DateTime, Dates.Millisecond.(df2.hour.*(60*60*1000)))
tb=TimeArray(df2[:,[:timestamp,:Volume_sum]], timestamp=:timestamp)
plot(tb)
the content of tb
4×1 TimeArray{Float64,1,DateTime,Array{Float64,1}} 2020-01-05T16:00:00 to 2020-01-06T08:00:00
│ │ Volume_sum │
├─────────────────────┼────────────┤
│ 2020-01-05T16:00:00 │ 50.0 │
│ 2020-01-05T17:00:00 │ 10.0 │
│ 2020-01-05T18:00:00 │ 40.0 │
│ 2020-01-06T08:00:00 │ 50.0 │
Method 2: There seems a more easy way by floor
function
df.hour2 = floor.(df.timestamp, Dates.Hour(1))
df2 = aggregate(df[:, [:hour2, :Volume]], :hour2, sum)
tb=TimeArray(df2[:,[:hour2,:Volume_sum]], timestamp=:hour2)
Method 3: Just use collapse
second form syntax
using Statistics
tb1 = collapse(ta[:, :Open], hour, first, first)
tb2 = collapse(ta[:, :High], hour, first, maximum)
tb3 = collapse(ta[:, :Low], hour, first, minimum)
tb4 = collapse(ta[:, :Close], hour, first, last)
tb5 = collapse(ta[:, :Volume], hour, first, sum)
tb = merge(tb1, tb2, tb3, tb4, tb5)
Answered By - Daniel YC Lin
Answer Checked By - Timothy Miller (JavaFixing Admin)