Re: [Question] JSON to R and data table/matrix arrangement

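Author: peterwu76 (金岡)   2017-06-07 09:32:12
Thanks to celestialgod for the help.
Here is my final code for downloading the historical data, so you can also see what I was trying to do.
Easier-to-read version:
https://hackmd.io/KwU2CMDMENQWmAZgEwGM4BZrmXcB2AThDmn2GQBNxKAGDVYYIA==?both
Notes:
I originally considered pulling real-time data as well, but found that the newest record it returns has the same timestamp as the newest record in the historical data. The newest record available is also about 8 hours behind the time of the request.
For example, if I run this code at 09:00, the last (newest) record is timestamped around 01:00.
So in the end I gave up on the real-time idea and just pull the historical data instead.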
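An editorial aside on the roughly 8-hour gap described above: one possible explanation, which is an assumption and not something confirmed anywhere in this thread, is that the feed reports timestamps in UTC while the clock used for comparison is Taiwan local time (UTC+8). The sketch below uses a made-up timestamp (not real AirBox data) and base R only, to show how the same instant reads in both zones.

```r
# Hypothetical newest record time, ASSUMED to be reported in UTC
latest_utc <- as.POSIXct("2017-06-07 01:00:00", tz = "UTC")

# The same instant rendered in Taiwan local time (UTC+8)
latest_local <- format(latest_utc, "%Y-%m-%d %H:%M", tz = "Asia/Taipei")
print(latest_local)  # "2017-06-07 09:00"
```

If that is what is happening, the newest record may not actually be 8 hours stale. `lubridate::with_tz()` would be another way to do the same conversion.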
My final code for historical data
# Fetch the data from the JSON endpoint
library(jsonlite)
url <- "https://data.lass-net.org/data/history.php?device_id=74DA38C7D1D2"
x <- fromJSON(url)
# Arrange the data into a data.table
library(data.table)
library(lubridate)
outDT <- rbindlist(x$feeds$AirBox)
# Parse the timestamps into proper date-times with the lubridate package
outDT[, `:=`(source    = x$source,
             version   = ymd_hms(x$version),
             device_id = x$device_id,
             timestamp = ymd_hms(timestamp))]
sortD <- outDT
# Keep only the columns of interest, selected by name
headers <- c("timestamp", "s_d0", "s_t0", "s_h0", "date", "time",
             "device_id", "gps_lon", "gps_lat", "version")
sortD <- subset(outDT, select = headers)
# Rename the sensor columns
colnames(sortD)[which(names(sortD) == "s_d0")] <- "PM2.5"
colnames(sortD)[which(names(sortD) == "s_t0")] <- "Temperature"
colnames(sortD)[which(names(sortD) == "s_h0")] <- "Humidity"
# Sort the data by timestamp
sortD$timestamp <- as.POSIXct(sortD$timestamp, tz = "UTC")
class(sortD$timestamp)
Final_data <- sortD[order(sortD$timestamp)]
View(Final_data)
# Write the data to a CSV file named with the current time
date <- sprintf("AirBox_74DA38C7D1D2_%s.csv", format(Sys.time(), "%Y%m%d%H%M"))
outfile <- paste0("D:\\AirBoxTest\\", date)
write.csv(Final_data, file = outfile)
#################################
# Hourly average and output
#################################
Final_data_hourly <- aggregate(
  list(PM2.5       = Final_data$PM2.5,
       Humidity    = Final_data$Humidity,
       Temperature = Final_data$Temperature),
  list(hourofday = cut(Final_data$timestamp, "1 hour")),
  mean)
# Write the hourly averages to a CSV file named with the current time
date <- sprintf("AirBox_74DA38C7D1D2_Hourly_%s.csv",
                format(Sys.time(), "%Y%m%d%H%M"))
outfile <- paste0("D:\\AirBoxTest\\", date)
write.csv(Final_data_hourly, file = outfile)
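Since the data already lives in a data.table, the hourly average could also be written with data.table's grouping syntax. This is only a sketch of an alternative, not the code used in the post: the table `toy` and its values are made up, and only the column names follow the post.

```r
library(data.table)

# Made-up stand-in for Final_data (three readings across two hours)
toy <- data.table(
  timestamp = as.POSIXct(c("2017-06-07 01:05:00",
                           "2017-06-07 01:35:00",
                           "2017-06-07 02:10:00"), tz = "UTC"),
  PM2.5       = c(10, 20, 30),
  Temperature = c(25, 27, 26),
  Humidity    = c(60, 62, 64)
)

# Group by the timestamp truncated to the hour, then average each sensor column
hourly <- toy[, lapply(.SD, mean),
              by = .(hour = as.POSIXct(trunc(timestamp, "hours"))),
              .SDcols = c("PM2.5", "Temperature", "Humidity")]

print(hourly)
```

Unlike `cut()`, which yields a factor, this keeps the hour as a date-time column, which may be more convenient for later plotting or merging.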
Author: carl090105 (Jing)   2017-06-07 12:47:00
Two suggestions:
1. For the subset, you can use outDT[, c(headers), with=FALSE]
2. For the rename, you can use data.table::setnames
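A minimal sketch of both suggestions on a small made-up table. The values and the extra `app` column are invented for illustration; only the sensor column names follow the post. `setnames()` renames by reference, so the table is not copied for each rename.

```r
library(data.table)

# Toy stand-in for outDT; values are made up for illustration
outDT <- data.table(
  timestamp = as.POSIXct(c("2017-06-07 01:00:00",
                           "2017-06-07 01:05:00"), tz = "UTC"),
  s_d0 = c(12, 15),
  s_t0 = c(26.1, 26.3),
  s_h0 = c(70, 71),
  app  = c("AirBox", "AirBox")   # hypothetical extra column to be dropped
)

headers <- c("timestamp", "s_d0", "s_t0", "s_h0")

# 1. Select the columns named in a character vector
sortD <- outDT[, headers, with = FALSE]

# 2. Rename several columns in one call, by reference
setnames(sortD,
         old = c("s_d0", "s_t0", "s_h0"),
         new = c("PM2.5", "Temperature", "Humidity"))

print(names(sortD))
```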
Author: peterwu76 (金岡)   2017-06-07 14:19:00
Thanks for suggesting a better approach!!
Author: clansoda (小笨)   2017-06-07 16:00:00
The first point could be outDT[, .(headers)], which would be more concise.
Author: carl090105 (Jing)   2017-06-07 20:26:00
I think if header is a string vector, then in data.table v1.10.2 you can use DT[, ..header]
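To make the difference between these forms concrete, here is a small sketch on made-up data. As far as I understand data.table's semantics, `.()` is an alias for `list()`, so `DT[, .(headers)]` builds a one-column table holding the strings themselves, whereas the `..` prefix tells data.table to look the variable up outside the table and treat its contents as column names.

```r
library(data.table)

# Made-up table and header vector for illustration
DT <- data.table(a = 1:3, b = 4:6, c = 7:9)
headers <- c("a", "b")

by_prefix <- DT[, ..headers]              # columns a and b
by_with   <- DT[, headers, with = FALSE]  # same result, older spelling
as_list   <- DT[, .(headers)]             # one column containing "a" and "b"

print(by_prefix)
print(as_list)
```

So when the column names are stored in a character vector, `..headers` or `with = FALSE` is likely what is wanted.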
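Author: celestialgod (天)   2017-06-07 20:29:00
For the renaming part, you can actually use match to change all three names at once. The way you are doing it copies the data.frame three times, which is pretty scary.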
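A sketch of the `match` idea on a made-up data.frame: look up the positions of all three old names at once and assign the new names in a single step. The column names come from the post; the values are invented.

```r
# Made-up stand-in for the AirBox table
df <- data.frame(
  timestamp = 1:2,
  s_d0 = c(10, 12),
  s_t0 = c(25, 26),
  s_h0 = c(60, 61)
)

old <- c("s_d0", "s_t0", "s_h0")
new <- c("PM2.5", "Temperature", "Humidity")

# match() returns the position of each old name; one assignment renames all three
names(df)[match(old, names(df))] <- new

print(names(df))
```

For a data.table, `setnames(DT, old, new)` achieves the same result by reference.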
Author: peterwu76 (金岡)   2017-06-10 18:14:00
Thanks! I will try to revise it! :)
