Handle also when header and data rows have different number of columns #189
Comments
👍 currently fiddling with an instance of Case 1: More column names than data columns.
I think both cases should be problems, not errors.
How does this look?
read_csv(col_types = "ii", "a,b\n1")
#> a b
#> 1 1 NA
read_csv(col_types = "ii", "a,b\n1,2,3")
#> Warning: 1 problems parsing literal data. See problems(...) for more
#> details.
#> a b
#> 1 1 2
read_csv("a,b\n1")
#> a
#> 1 1
read_csv("a,b\n1,2,3")
#> a b X3
#> 1 1 2 3
I guess they probably should all generate warnings :/
A bit more progress:
read_csv(col_types = "ii", "a,b\n1")
#> a b
#> 1 1 NA
read_csv(col_types = "ii", "a,b\n1,2,3")
#> Warning: 1 parsing failure (literal data)
#> row col expected actual
#> 1 3 2 columns
#> a b
#> 1 1 2
read_csv("a,b\n1")
#> Warning: 1 parsing failure (literal data)
#> row col expected actual
#> NA NA 1 column names 2
#> a
#> 1 1
read_csv("a,b\n1,2,3")
#> Warning: 1 parsing failure (literal data)
#> row col expected actual
#> NA 3 Missing column name
#> a b X3
#> 1 1 2 3
Looks good to me. Yes, it is helpful to be warned when
Final version:
read_csv(col_types = "ii", "a,b\n1")
#> Warning: 1 parsing failure.
#> row col expected actual
#> 1 -- 2 columns 1 columns
#> a b
#> 1 1 NA
read_csv(col_types = "ii", "a,b\n1,2,3")
#> Warning: 1 parsing failure.
#> row col expected actual
#> 1 -- 2 columns 3 columns
#> a b
#> 1 1 2
read_csv(col_types = "ii", "a,b\n1,2,3,4")
#> Warning: 1 parsing failure.
#> row col expected actual
#> 1 -- 2 columns 4 columns
#> a b
#> 1 1 2
read_csv("a,b\n1")
#> Warning: 1 parsing failure.
#> row col expected actual
#> -- -- 1 col names 2 col names
#> a
#> 1 1
read_csv("a,b\n1,2,3")
#> Warning: 1 parsing failure.
#> row col expected actual
#> -- -- 3 col names 2 col names
#> a b X3
#> 1 1 2 3
read_csv("a,b\n1,2,3,4")
#> Warning: 1 parsing failure.
#> row col expected actual
#> -- -- 4 col names 2 col names
#> a b X3 X4
#> 1 1 2 3 4
This is looking pretty good to me :) (BTW I've been using reprex to make these code snippets and it's awesome!)
Not quite right, but I'll finish it off tomorrow:
read_csv("a,b\n\n2,3")
#> a b
#> 1 NA NA
#> 2 2 3
read_csv("a,b\n\n\n2,3")
#> Warning: 1 parsing failure.
#> row col expected actual
#> 2 -- 2 columns 1 columns
#> a b
#> 1 NA NA
#> 2 NA NA
#> 3 2 3
@HenrikBengtsson is the main sufferer but I agree this looks great. (Thanks for kind words re: reprex ... yeah, it certainly feels useful and ppl have given neat ideas and PRs already.)
I'm pretty sure I got everything - please open a new issue if you discover a case I missed.
Awesome - thanks for this. I've confirmed that it works with my real-world data that originally triggered this issue. You just made life a bit less hard for quite a few people.
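For anyone who wants to inspect these warnings programmatically rather than read them off the console, here is a minimal sketch. It assumes readr 2.0 or later (where I() marks a string as literal data) and relies only on problems(), which returns the parsing failures recorded for a data frame. The wording of the expected/actual columns differs between readr versions, so no output is shown.

```r
library(readr)

# Literal data whose data row has one more cell than the header has names.
df <- read_csv(I("a,b\n1,2,3"), col_types = "ii")

# One row per recorded parsing failure (row, col, expected, actual, ...).
probs <- problems(df)
print(probs)
```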
Case 1: More column names than data columns
read.table() has fill=TRUE to handle the case where there are more column names than columns in the data rows. Looking at the help, I don't think there is a way to use read_tsv() to deal with this case.
WISH: Make it possible to "fill" data rows with empty values/NAs when data rows lack trailing cells. This would assume the missing ones are at the end, cf. argument fill of read.table().
Case 2: Fewer column names than data columns
read.table() does not handle this. I don't think read_tsv() does either.
WISH: Make it possible to "fill" column names with empty values/NAs when the header lacks trailing column names. This would assume the missing ones are at the end, cf. argument fill of read.table().
Background
For a real-world example, please see https://gist.github.com/HenrikBengtsson/dabc383aaa958c0ed49a. The above examples are never-ending stories in my life.
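To make the two cases concrete, here is a rough base R sketch. It is an illustration only, not how readr implements the feature. The sample data and variable names (txt1, txt2, case1, case2) are made up, and the X3/X4 placeholder names are an assumption borrowed from the output shown in the comments.

```r
# Case 1: the header names two columns, but the second data row has one cell.
# fill = TRUE pads the short row with NA at the end.
txt1 <- "a\tb\n1\t2\n3\n"
case1 <- read.table(text = txt1, header = TRUE, sep = "\t", fill = TRUE)

# Case 2: the header names two columns, but the data row has four cells.
# Count the cells per line, then pad the header before reading the data.
txt2 <- "a\tb\n1\t2\t3\t4\n"
rows <- strsplit(strsplit(txt2, "\n", fixed = TRUE)[[1]], "\t", fixed = TRUE)
col_names <- rows[[1]]
n_cols <- max(lengths(rows))
if (length(col_names) < n_cols) {
  # Placeholder names for the unnamed trailing columns (naming is an assumption).
  col_names <- c(col_names, paste0("X", seq(length(col_names) + 1, n_cols)))
}
# Skip the original header line and supply the padded names instead.
case2 <- read.table(text = txt2, header = FALSE, skip = 1, sep = "\t",
                    col.names = col_names, fill = TRUE)
```

The cell counting here is deliberately naive: strsplit() ignores quoting and drops a trailing empty cell, so real-world files would need something more careful.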