I am importing a text file with Swedish letters (å, ä, ö).
A. If I use ReadList["sv_JSPFirefox.txt"], it imports the file nicely, but then I cannot use a command like:
dict2 = Select[dict1, Not@StringContainsQ[#, Alternatives @@ rejectlist] &]
The error message is:
StringContainsQ::strse: String or list of strings expected at position 1
in StringContainsQ[Aapua,.|0|1|2|3|4|5|6|7|8|9|{A,B,C,D,E,F,G,H,
I,J,K,L,M,N,O,P,Q,R,S,T,U,V,W,X,Y,Z}|Å|Ä|Ö]. >>
Two screen dumps show the problem better:
B. If I use ReadList["sv_JSPFirefox.txt", Word]
it imports the file, but those characters are garbled. At least I can then use the command above and proceed, but commands (search, etc.) that include these characters do not work properly.
I have tried to add "UTF8", "UTF-8", "Unicode", "Lines" and other options, but nothing seems to help. Any pointer in the right direction would be greatly appreciated. Thank you for your time!
20160306 Edit. @C.E. I made a sample file and a very small file called "shrimpsandwich.txt". This is enough because this is a word in Swedish that contains all three letters, "räksmörgås". You might have heard of smörgåsbord.
Sample files: A small sample file is displayed in the screen dump above.
Answer
The answer by @Xavier solves the problem; the discussion above may give some further insight. Thank you all for taking the time to test and answer.
Solution:
dict1 = ToString /@ ReadList["sv_JSPFirefox.txt"]
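A likely reason this works, sketched below under stated assumptions: when ReadList is called without a type argument it reads expressions, so a bare word such as Aapua comes back as a symbol rather than a string, and StringContainsQ then complains about its first argument. The lists raw and rejectlist here are hypothetical stand-ins for the file contents and for the reject list used in the question, not the original data.

```mathematica
(* Hypothetical stand-in for what ReadList["sv_JSPFirefox.txt"] returns:
   bare words read as expressions, i.e. symbols *)
raw = {Aapua, abborre};
Head /@ raw        (* every element has head Symbol, not String *)

(* The conversion used in the solution *)
dict1 = ToString /@ raw;
Head /@ dict1      (* every element now has head String *)

(* Hypothetical reject list: a period, the digits, and upper-case A to Z *)
rejectlist = Join[{"."}, CharacterRange["0", "9"], CharacterRange["A", "Z"]];

(* The filter from the question now receives strings and runs without the strse message *)
dict2 = Select[dict1, Not@StringContainsQ[#, Alternatives @@ rejectlist] &]
```

With these stand-in values, only the all-lower-case word survives the filter, since "Aapua" contains an upper-case A.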