apply - R - Count at each row the number of columns in the "row neighbourhood" that contain only NA

- September 15, 2015

How can I create a vector that gives, for each row of a data frame, the number of columns that are NA (or a custom value) in that row, in the n rows above, and in the m rows below?

So if m = n = 1 (i.e. how many columns in each row are NA and also have an NA immediately before and after), and the data frame is

    structure(list(x = 1:8,
                   a = c(3L, NA, 10L, NA, 6L, NA, 5L, NA),
                   b = c(6L, NA, NA, NA, 8L, NA, 13L, NA),
                   c = c(NA, 12L, 14L, NA, NA, NA, 9L, NA),
                   d = c(NA, NA, NA, NA, NA, 11L, 7L, NA)),
              .Names = c("x", "a", "b", "c", "d"),
              class = "data.frame",
              row.names = c(NA, -8L))

i.e.

      x  a  b  c  d
    1 1  3  6 NA NA
    2 2 NA NA 12 NA
    3 3 10 NA 14 NA
    4 4 NA NA NA NA
    5 5  6  8 NA NA
    6 6 NA NA NA 11
    7 7  5 13  9  7
    8 8 NA NA NA NA

I want the vector

count 0 1 2 1 1 0 0 0

(If the first and last entries are NAs, that's fine.) I'm trying to mimic the COUNTIFS function in Excel, i.e. COUNTIFS(B2:F2,"",B3:F3,"",B4:F4,"") for row 3.

I think I see what you mean.

Suppose your data frame is called x.

First, for each (row, column) in x, we need to see whether there is an NA in that cell, and also an NA in the same column in the n rows before and the m rows after.

Let's start with a single row, say row i = 2. We have n = 1 and m = 1 (from the example in the question).

    i <- 2
    n <- 1
    m <- 1

Let's count the number of NAs in each column for rows i - n to i + m inclusive (is.na returns TRUE where a value is NA, and colSums gives the column sums):

    y <- colSums(is.na(x[(i - n):(i + m), ]))
    # x a b c d
    # 0 1 2 1 3

Now, a column has an NA in the previous, current, and next row if we counted 3 NAs for it (i.e. only column d qualifies here):

    y == n + m + 1
    #     x     a     b     c     d
    # FALSE FALSE FALSE FALSE  TRUE

So the number of columns that satisfy our criterion (and hence the ith element of the output) is:

    sum(y == n + m + 1)
    # 1

We can use sapply to apply this to each row:

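    countifs <- function(df, n, m) {
        nrows <- nrow(df)
        sapply(1:nrows,
               function(i) {
                   # Clamp the window so it never runs off the data frame.
                   startrow <- max(i - n, 1)
                   endrow   <- min(i + m, nrows)
                   # Use the df argument here, not the global x.
                   y <- colSums(is.na(df[startrow:endrow, ]))
                   # A clamped (shorter) window can never reach n + m + 1.
                   sum(y == n + m + 1)
               })
    }

    countifs(x, 1, 1)
    # [1] 0 1 2 1 1 0 0 0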

You mentioned that you might want to compare against a custom value rather than NA. In that case, instead of testing the window with is.na(...), you can test it with ... == value (but not if value is NA, in which case use is.na).
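The custom-value variant could be sketched roughly as follows. The name countifs_value and its value argument are hypothetical, not part of the original answer, and the data frame is rebuilt here so the snippet runs on its own. One detail to watch: == returns NA for NA cells, which would propagate through colSums, so this sketch treats NA cells as non-matches.

```r
# Example data from the question, assumed to be stored in `x`.
x <- data.frame(
  x = 1:8,
  a = c(3L, NA, 10L, NA, 6L, NA, 5L, NA),
  b = c(6L, NA, NA, NA, 8L, NA, 13L, NA),
  c = c(NA, 12L, 14L, NA, NA, NA, 9L, NA),
  d = c(NA, NA, NA, NA, NA, 11L, 7L, NA)
)

# Hypothetical helper: for each row, count the columns whose whole
# neighbourhood (n rows above to m rows below) matches `value`.
countifs_value <- function(df, n, m, value = NA) {
  # Build the logical match matrix once; NA cells never match a non-NA value.
  hits <- if (is.na(value)) is.na(df) else !is.na(df) & df == value
  nrows <- nrow(df)
  sapply(seq_len(nrows), function(i) {
    startrow <- max(i - n, 1)
    endrow   <- min(i + m, nrows)
    y <- colSums(hits[startrow:endrow, , drop = FALSE])
    sum(y == n + m + 1)
  })
}

countifs_value(x, 1, 1)  # value defaults to NA
# [1] 0 1 2 1 1 0 0 0
```

With value left at its default this reproduces the NA case from the question; passing, say, value = 0 would count columns that are 0 throughout the window instead.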

Also, you can save a bit of work by running sapply only on rows n + 1 to nrow(df) - m (the rows that have a complete window) and setting the first n and last m elements to 0 directly.
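A minimal sketch of that shortcut, assuming the rows with a complete window are n + 1 through nrow(df) - m; the name countifs_interior is made up for illustration, and the example data is rebuilt so the snippet is self-contained:

```r
# Example data from the question, assumed to be stored in `x`.
x <- data.frame(
  x = 1:8,
  a = c(3L, NA, 10L, NA, 6L, NA, 5L, NA),
  b = c(6L, NA, NA, NA, 8L, NA, 13L, NA),
  c = c(NA, 12L, 14L, NA, NA, NA, 9L, NA),
  d = c(NA, NA, NA, NA, NA, 11L, 7L, NA)
)

# Hypothetical variant: only rows with a full window are evaluated.
countifs_interior <- function(df, n, m) {
  nrows <- nrow(df)
  out <- integer(nrows)                # first n and last m rows stay 0
  if (nrows < n + m + 1) return(out)   # no row has a complete window
  nas <- is.na(df)                     # compute the NA matrix once
  interior <- (n + 1):(nrows - m)
  out[interior] <- sapply(interior, function(i) {
    y <- colSums(nas[(i - n):(i + m), , drop = FALSE])
    sum(y == n + m + 1)
  })
  out
}

countifs_interior(x, 1, 1)
# [1] 0 1 2 1 1 0 0 0
```

Because the edge rows are never evaluated, the max/min clamping is no longer needed, and the early return covers data frames too short to hold even one full window.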
