Re: [Question] shiny crashes when reading a large CSV file

Author: celestialgod (天)   2017-10-21 22:42:58
※ Quoting Esmelee (Esme):
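: Post category hints:
: - Question: shiny
: [Question type]:
: Programming help (I want to do something in R but don't know how to write it in R)
: [Familiarity with the software]:
: Beginner (I have programmed in other languages, but I'm not familiar with the syntax)
: [Problem description]:
: I wrote a shiny app that reads several years of historical data,
: but my CSV file is 2 GB.
: Reading it works, but it crashes once I use it with shiny plus packages.
: Without shiny, the plot renders fine.
: Someone suggested MySQL, but when I loaded the data into MySQL, MySQL crashed.
: Is the file too big?? It's only 2 GB.
: How can I get this to run properly in shiny?
: I have 16 GB of RAM.
: Would it work if I used something like Spark or Hadoop?
: Also, how do I build the site when the backend file is 2 GB?
: shinyapps.io seems out of the question...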
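Here is a simple modification for you: it just switches to data.table,
because a data.frame gets copied repeatedly along the way, which increases memory usage and slows things down.
A test example: https://pastebin.com/tHZ8UGpx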
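To try the same read-and-key pattern without the 2 GB file, here is a minimal sketch. The toy CSV, the `toy` table and the values in it are made up for illustration; only the fread/setkey calls mirror the modification in this post.

```r
library(data.table)

# Hypothetical toy data standing in for alldata_test.csv
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "year,month,day,hour,site,PM2.5",
  "2010,01,01,00,A,12.5",
  "2010,01,01,01,A,20.0",
  "2011,01,01,00,B,33.1"
), csv_path)

# Read the date parts as character so they compare directly
# with the strings that selectInput returns
toy <- fread(csv_path,
             colClasses = c("PM2.5" = "numeric",
                            year = "character",
                            month = "character",
                            day = "character",
                            hour = "character"))
setkey(toy, year, month, day, hour)

# Subsetting returns only the matching rows as a new, smaller table
subset_2010 <- toy[year == "2010"]
print(subset_2010)
```

Reading the date parts as character keeps values such as "01" intact, which is why the comparisons against the shiny inputs can stay plain string comparisons.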
: [Code example]:
: library(leaflet)
: library(shiny)
: library(shinydashboard)
: library(readr)
: library(methods)
: library(DT)
: library(RCurl)
library(data.table)
alldata_2010 <- fread("alldata_test.csv",
                      colClasses = c("PM2.5" = "numeric",
                                     year = "character",
                                     month = "character",
                                     day = "character",
                                     hour = "character"))
setkey(alldata_2010, year, month, day, hour)
: ui <-
: fluidPage(
: titlePanel("Basic DataTable"),
: fluidRow(
: column(4,
: selectInput("year",
: "year:", alldata_2010, selectize=TRUE)
: ),
: column(4,
: selectInput("month",
: "month:", alldata_2010, selectize=TRUE)
: ),
: column(4,
: selectInput("day",
: "day:", alldata_2010, selectize=TRUE)
: ),
: column(4,
: selectInput("hour",
: "hour:", alldata_2010, selectize=TRUE)
: ),
: fluidRow(
: title = "data MAP",
: collapsible = TRUE,
: width = "100%",
: height = "100%",
: leafletOutput("datamap", height = "900px")
: )
: )
: )
: server <-
: function(input, output) {
: output$datamap <- renderLeaflet({
# no copy() needed: each subset below returns a new table and never modifies alldata_2010
data <- alldata_2010
: if (input$year != "All") {
data <- data[year == input$year]
: }
: if (input$month != "All") {
data <- data[month == input$month]
: }
: if (input$day != "All") {
data <- data[day == input$day]
: }
: if (input$hour != "All") {
data <- data[hour == input$hour]
: }
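However, the best way to rewrite it is this:
filter <- ""
for (colname in c("year", "month", "day", "hour")) {
  if (input[[colname]] != "All")
    # the columns are character, so the value is quoted
    filter <- sprintf("%s %s %s == '%s'", filter,
      ifelse(nchar(filter) > 0, "&", ""), colname, input[[colname]])
}
# parse the string into an expression first; with no filter, keep all rows
data <- if (nchar(filter) > 0) alldata_2010[eval(parse(text = filter))] else alldata_2010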
: cPal <- colorNumeric(palette =
: c("green","orange","red","purple"),domain = 0:100)
: leaflet(taiwan) %>% addProviderTiles(providers$CartoDB.Positron) %>%
: addPolygons(color = "#444444", weight = 1.5, smoothFactor = 1.5,
: opacity = 1.5, fillOpacity = 0.1) %>%
: addCircleMarkers(lng=data$TWD97Lon,lat=data$TWD97Lat,
: radius=13,stroke=FALSE, fillOpacity = 0.9,
: fillColor = ~cPal(data$PM2.5),
: label =~as.character(data$PM2.5),
: popup = ~as.character(data$site))%>%
: addLegend("bottomright", pal = cPal, values =data$PM2.5,
: title = "PM2.5",
: labFormat = labelFormat(suffix = " "),opacity = 1)
: }
: )
: }
: shinyApp(ui = ui, server = server)
: [Environment]:
: R version 3.3.3
: Windows 10
: 16 GB RAM
: [Keywords]: shiny, large data
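If you want to check the string-built filter outside shiny, something like the following should work. This is only a sketch: `toy`, `fake_input` and the helper `filter_rows` are hypothetical names, and a plain list stands in for shiny's `input`. Note that the filter string has to go through parse() before data.table can evaluate it, and because the columns are character the values are quoted inside the string.

```r
library(data.table)

# Hypothetical toy table; the real app uses alldata_2010
toy <- data.table(
  year  = c("2010", "2010", "2011"),
  month = c("01", "02", "01"),
  day   = c("01", "01", "01"),
  hour  = c("00", "00", "00"),
  PM2.5 = c(12.5, 20.0, 33.1)
)

# Hypothetical helper: `input` is any list-like object
# with year/month/day/hour entries
filter_rows <- function(dt, input) {
  conds <- character(0)
  for (colname in c("year", "month", "day", "hour")) {
    if (input[[colname]] != "All") {
      # columns are character, so the value is quoted in the expression
      conds <- c(conds, sprintf("%s == '%s'", colname, input[[colname]]))
    }
  }
  # nothing selected: keep every row
  if (length(conds) == 0) return(dt)
  filter <- paste(conds, collapse = " & ")
  # the string is parsed into an expression before data.table evaluates it
  dt[eval(parse(text = filter))]
}

fake_input <- list(year = "2010", month = "All", day = "01", hour = "All")
result <- filter_rows(toy, fake_input)
print(result)
```

Building one combined condition means the table is subset once per render instead of once per input.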
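Author: Esmelee (Esme)   2017-10-22 11:22:00
No luck, it still doesn't work. The error says it ran out of memory: "cannot allocate vector of size 93.3 Mb". Changing memory.limit didn't help either, so I've decided to try SQL.
Author: psinqoo (零度空間)   2017-11-11 22:48:00
Butting in: running shiny on Windows 10??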
