R - Memory bottleneck in seqdist()?


Is it possible that there is a memory bottleneck in seqdist()?

I'm a researcher working with register data on a Windows x64 computer with 64 GB of RAM. Our data consists of 60,000 persons, and at the moment I'm working on data that has 2.2 million lines in spell format. I can't run seqdist on it (method="OM", indel=1, sm="TRATE", with.missing=TRUE, full.matrix=FALSE). The error message is the same as the one described here, and the important part seems to point to not having enough memory: "negative length vectors are not allowed".

OK, so seqdist() doesn't seem to utilize the whole RAM. Right now I'm running it on a sample of 40,000 persons, and it seems to go through, with R using less than 2 GB of RAM. If I run seqdist() on all 60,000 persons, I get the error.

Might there be a size limit of 2^31-1 somewhere in there?
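As a rough back-of-the-envelope check (my own arithmetic, not anything taken from the TraMineR source): if the pairwise distances are addressed with a 32-bit signed integer index on the order of n*n or n*(n-1), that index overflows 2^31-1 right around n = 46341, which would match the behaviour described below.

# R's 32-bit signed integer limit
.Machine$integer.max    # 2147483647 = 2^31 - 1

# cells in a full n x n distance matrix
46340^2                 # 2147395600 -> below the limit
46341^2                 # 2147488281 -> above the limit

# an index of the form n*(n-1), consistent with 46341 working and 46342 failing
46341 * 46340           # 2147441940 -> below the limit
46342 * 46341           # 2147534622 -> above the limit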

Calculating Ward clusters readily utilizes the available RAM. I've had it use 40 GB of RAM, which at least proves that R is capable of using large amounts of RAM.

Edit: The maximum number of cases is 46341. A warning though: it still eats a lot of memory even if the size is <= 46341. Example:

library(TraMineR)

id <- seq(from=1, to=46342, by=1)
set.seed(234324)
time1 <- sample(seq(from=1, to=3, by=1), size=46342, replace=TRUE)
time2 <- sample(seq(from=1, to=3, by=1), size=46342, replace=TRUE)
time3 <- sample(seq(from=1, to=3, by=1), size=46342, replace=TRUE)

testdata <- data.frame(id, time1, time2, time3)

testseq <- seqdef(testdata, 2:4)

testdist <- seqdist(testseq, method="OM", indel=1, sm="TRATE", full.matrix=FALSE)
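One possible workaround (a sketch of my own under the assumption that many sequences in register data are identical, not something from the error thread): aggregate identical cases with wcAggregateCases() from the WeightedCluster package, run seqdist() only on the unique sequences, and carry the aggregation weights into the clustering step. The aggIndex/aggWeights components follow the WeightedCluster documentation; double-check them against your installed version.

library(TraMineR)
library(WeightedCluster)

# aggregate identical sequences (columns 2:4 hold the states)
ac <- wcAggregateCases(testdata[, 2:4])
print(ac)   # shows how many unique cases remain

# build the sequence object from the unique cases, carrying the weights
uniqueSeq <- seqdef(testdata[ac$aggIndex, 2:4], weights = ac$aggWeights)

# distance computation over unique sequences only, far fewer than 46341 cases
uniqueDist <- seqdist(uniqueSeq, method = "OM", indel = 1, sm = "TRATE",
                      full.matrix = FALSE)

# weighted Ward clustering; 'members' passes the case counts to hclust
wardClust <- hclust(as.dist(uniqueDist), method = "ward.D",
                    members = ac$aggWeights)

Whether this helps depends on how many distinct trajectories the 60,000 persons actually have; with short sequences and few states the reduction can be dramatic.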

