Understanding the reference properties of data.table in R


Just to clear some things up for myself, I would like to better understand when copies are made and when they are not in data.table. As the question on understanding exactly when a data.table is a reference to (vs. a copy of) another data.table points out, if one runs the following, one ends up modifying the original:

    library(data.table)

    dt <- data.table(a=c(1,2), b=c(11,12))
    print(dt)
    #      a  b
    # [1,] 1 11
    # [2,] 2 12

    newdt <- dt         # reference, not copy
    newdt[1, a := 100]  # modify new dt

    print(dt)           # dt is modified too.
    #        a  b
    # [1,] 100 11
    # [2,]   2 12

However, if one does this (for example), one ends up modifying only the new version:

    dt = data.table(a=1:10)
    dt
         a
     1:  1
     2:  2
     3:  3
     4:  4
     5:  5
     6:  6
     7:  7
     8:  8
     9:  9
    10: 10

    newdt = dt[a<11]
    newdt
         a
     1:  1
     2:  2
     3:  3
     4:  4
     5:  5
     6:  6
     7:  7
     8:  8
     9:  9
    10: 10

    newdt[1:5, a:=0L]

    newdt
         a
     1:  0
     2:  0
     3:  0
     4:  0
     5:  0
     6:  6
     7:  7
     8:  8
     9:  9
    10: 10

    dt
         a
     1:  1
     2:  2
     3:  3
     4:  4
     5:  5
     6:  6
     7:  7
     8:  8
     9:  9
    10: 10

As I understand it, the reason this happens is that when you execute an i statement, data.table returns a whole new table, as opposed to a reference to the memory occupied by the selected elements of the old data.table. Is this correct?

Edit: sorry, I meant i, not j (changed above).

When you create newdt in the second example, you are evaluating i (not j). := assigns by reference within the j argument. There is no equivalent in the i statement, because the self-reference over-allocates the columns, but not the rows.
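One way to check the difference between the two cases is to compare object addresses. The sketch below assumes the `address()` helper exported by data.table; the names `by_ref` and `by_i` are made up for illustration.

```r
library(data.table)

dt <- data.table(a = 1:10)

by_ref <- dt          # plain assignment: a second name for the same object
by_i   <- dt[a < 11]  # subsetting in i: a new table, even though every row matches

# Compare the memory addresses of the list vectors behind each table
same_object_ref <- address(by_ref) == address(dt)
same_object_i   <- address(by_i) == address(dt)

by_i[1:5, a := 0L]    # changes only the table returned by the i subset
by_ref[10, a := 99L]  # changes dt as well, because both names share one object
```

If the reasoning above holds, `same_object_ref` should be TRUE and `same_object_i` FALSE, and only the update made through `by_ref` should be visible in `dt`.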

A data.table is a list. Its length equals the number of columns, but it is over-allocated so that you can add more columns without copying the entire table (e.g. using := in j).
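The over-allocation can be observed directly. This is a minimal sketch using data.table's `truelength()` and `address()`; the exact number of spare slots depends on the data.table version and settings, so no specific value is assumed here.

```r
library(data.table)

dt <- data.table(a = 1:10)

n_cols  <- length(dt)      # columns actually in use
n_slots <- truelength(dt)  # column pointer slots that were allocated
before  <- address(dt)

dt[, b := a * 2L]          # add a column by reference in j

after <- address(dt)       # same object, one more slot now in use
```

Because a spare slot was available, adding `b` should leave the address of `dt` unchanged, which is what makes := in j cheap for new columns.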

If we inspect the data.table, we can see the truelength (tl = 100), which is the number of column pointer slots:

    .Internal(inspect(dt))
    @1427d6c8 19 VECSXP g0c7 [OBJ,NAM(2),ATT] (len=1, tl=100)
      @b249a30 13 INTSXP g0c4 [NAM(2)] (len=10, tl=0) 1,2,3,4,5,...

Within the data.table, each element (column) has length 10 and tl=0. Currently there is no method to increase the truelength of the columns to allow appending rows by reference.
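In practice this means that appending rows produces a new table rather than growing the existing one in place. A small sketch, assuming data.table's `rbindlist()` and `address()`; the name `bigger` is illustrative only.

```r
library(data.table)

dt <- data.table(a = 1:10)
before <- address(dt)

# Appending a row returns a new, longer table; dt itself is left as it was
bigger <- rbindlist(list(dt, data.table(a = 11L)))

rows_original <- nrow(dt)
rows_bigger   <- nrow(bigger)
new_object    <- address(bigger) != before
```

To keep the longer table you have to assign the result, for example back to `dt`, which rebinds the name to a new object instead of modifying the old one.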

From ?truelength:

"Currently, it's just the list vector of column pointers that is over-allocated (i.e. truelength(dt)), not the column vectors themselves, which would in future allow fast row insert()"

When you evaluate i, data.table doesn't check whether you have returned all rows in the same order as in the original (and then skip the copy in that case only); it simply returns the copy.

