R - Memory bottleneck in seqdist()?
Is it possible that there is a memory bottleneck in seqdist()?
I'm a researcher working with register data on a Windows x64 computer with 64 GB of RAM. Our data consists of 60,000 persons, and at the moment I'm working on data that has 2.2 million lines in spell format. I can't run seqdist on it (method="OM", indel=1, sm="TRATE", with.missing=TRUE, full.matrix=FALSE); the error message is the same as the one here, and the important part seems to point to there not being enough memory: "negative length vectors are not allowed".
OK, so seqdist() doesn't seem to utilize the whole RAM. Right now I'm running it on a sample of 40,000 persons, and it seems to go through, with R using less than 2 GB of RAM. If I run seqdist() on all 60,000 persons, I get the error.
Might there be a size limit of 2^31 - 1 in there somewhere?
Calculating Ward clusters, by contrast, readily utilizes the available RAM. I've had it use 40 GB of RAM, which at least proves that R is capable of using large amounts of memory.
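For context, the clustering step is nothing exotic; a sketch of what it looks like (with hypothetical object names, and assuming the dist object that seqdist(..., full.matrix = FALSE) returns) is:

## Sketch of the Ward step (hypothetical object names; "om.dist" stands for the
## dist object returned by seqdist(..., full.matrix = FALSE) on the real data).
## hclust() works on its own copy of the dissimilarities, which is presumably
## where the tens of GB of RAM go.
wardtree <- hclust(om.dist, method = "ward.D2")
clusters <- cutree(wardtree, k = 8)   # e.g. an 8-cluster solution
table(clusters)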
Edit: The maximum number of cases turns out to be 46,341. A warning though: it still eats a lot of memory even when the size is <= 46,341. Example:
library(TraMineR)

## 46,342 cases with three time points each, filled with random states 1-3
id <- seq(from = 1, to = 46342, by = 1)
set.seed(234324)
time1 <- sample(seq(from = 1, to = 3, by = 1), size = 46342, replace = TRUE)
time2 <- sample(seq(from = 1, to = 3, by = 1), size = 46342, replace = TRUE)
time3 <- sample(seq(from = 1, to = 3, by = 1), size = 46342, replace = TRUE)
testdata <- data.frame(id, time1, time2, time3)

testseq <- seqdef(testdata, 2:4)
testdist <- seqdist(testseq, method = "OM", indel = 1, sm = "TRATE",
                    full.matrix = FALSE)
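One arithmetic detail that lines up with the 46,341/46,342 boundary (an assumption about the internals on my part, not verified against the C source): a quantity of order n * (n - 1) crosses the 32-bit integer limit exactly between those two values.

## Where the 32-bit limit falls relative to the observed boundary
## (assumption: some internal index of order n * (n - 1) is a 32-bit integer)
.Machine$integer.max   # 2147483647 = 2^31 - 1
46341 * 46340          # 2147441940 -> still below the limit (seqdist runs)
46342 * 46341          # 2147534622 -> above the limit (seqdist fails)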