Regex Split String On Specific Chars Outside Quotes
Solution 1:
Don't use split(); use match() instead, and it becomes easy:
result = subject.match(/[>#.[{](?:"[^"]*"|[^">#.[{])+/g);
See it live on regex101.com.
Explanation:
[>#.[{] # Match a "splitting" character
(?: # Start of group to match either...
"[^"]*" # a quoted string
| # or
[^">#.[{] # any character except quotes and "splitting" characters
)+ # Repeat at least once.
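As a quick check, the pattern can be run against the sample selector used in the other answers on this page. This is only a minimal sketch: the helper name splitOutsideQuotes is made up for illustration, and the `|| []` fallback is an addition, since match() returns null when nothing matches.

```javascript
// Hypothetical helper wrapping the regex from this answer.
function splitOutsideQuotes(subject) {
  // match() returns null when there is no match, so fall back to an empty array.
  return subject.match(/[>#.[{](?:"[^"]*"|[^">#.[{])+/g) || [];
}

const sample = '>div#a.more.style.ui[url="in.tray"]{value}';
console.log(splitOutsideQuotes(sample));
// [ '>div', '#a', '.more', '.style', '.ui', '[url="in.tray"]', '{value}' ]
```

Note that every match has to begin with a "splitting" character, so any text before the first one (for example the div in div#a) is not returned.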
Solution 2:
It's hard to come up with a solution using only one regex.
I can propose this:
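var i = 0, s = '>div#a.more.style.ui[url="in.tray"]{value}';
// The counter relies on segments alternating: unquoted, quoted, unquoted, ...
// (so the string must start outside quotes and contain no whitespace)
var tokens = s.replace(/("[^"]+"|[^"\s]+)/g, function(v){
  // Quoted segments are kept as is; elsewhere @@@ goes before each separator
  return (i++)%2 ? v : v.replace(/([.>#\[{])/g, '@@@$1');
}).split('@@@').filter(Boolean);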
(Replace @@@ with a string you know isn't in your input.)
The idea is to
- conceptually divide the initial string into alternating segments outside quotes and inside quotes, the quoted ones keeping their quotes (this is not a real split)
- outside of the quotes, add @@@ before each separator
- split the resulting string on @@@
- remove the (potential) empty strings using filter
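The steps above can also be sketched without the i++ counter. In this variation (not the code from the answer), each segment is checked for a leading quote instead, so the result does not depend on segments strictly alternating. The function name splitWithMarker and its marker parameter are hypothetical.

```javascript
// Hypothetical helper: mark separators outside quotes, then split on the marker.
function splitWithMarker(s, marker = '@@@') {
  // The marker must not already occur in the input, or the split would be wrong.
  if (s.includes(marker)) throw new Error('marker occurs in input');

  // Step 1: walk over quoted segments and runs of non-quote characters.
  const marked = s.replace(/"[^"]*"|[^"]+/g, function (segment) {
    // Quoted segments are copied unchanged.
    if (segment[0] === '"') return segment;
    // Step 2: outside quotes, put the marker in front of every separator.
    return segment.replace(/[.>#\[{]/g, function (sep) {
      return marker + sep;
    });
  });

  // Steps 3 and 4: split on the marker and drop empty strings.
  return marked.split(marker).filter(Boolean);
}

const tokens = splitWithMarker('>div#a.more.style.ui[url="in.tray"]{value}');
console.log(tokens);
// [ '>div', '#a', '.more', '.style', '.ui', '[url="in.tray"]', '{value}' ]
```

The inner replacement uses a function rather than a string such as marker + '$1', so a marker containing $ is not misread as a replacement pattern.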
Solution 3:
I do wonder if regex is really the way to go in this case. I know this was tagged as regex, but I'd like to share a non-regex solution which simply processes each character:
var string = '>div#a.more.style.ui[url="in.tray"]{value}';
var delims = [ '>', '#', '.', '[', '{' ];
var inQuotes = false;
var parts = [];
var part = string[0]; // Start with first character
for(var i = 1; i < string.length; i++) {
    var character = string[i];
    if(character == '"') inQuotes = !inQuotes; // Toggle on every double quote
    if(!inQuotes && delims.indexOf(character) > -1) {
        parts.push(part); // A delimiter outside quotes closes the current part
        part = character;
    } else part += character;
    if(i == string.length-1) parts.push(part); // Flush the last part
}
console.log(parts);
Output:
[ '>div',
'#a',
'.more',
'.style',
'.ui',
'[url="in.tray"]',
'{value}' ]
The inQuotes business will not work for escaped quotes within quotes, e.g. "He said, \"hi there!\"", but for simple cases like this it will work. You could extend it to detect an escaped quote inside a quote by comparing the previous character to "\" and checking whether inQuotes is currently true, I suppose, but there are probably better solutions to that.
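One way the escaped-quote extension could look is sketched below. It is a variation on the suggestion above: instead of looking back at the previous character, it copies a backslash and the character after it in one step while inside quotes, which also copes with an escaped backslash right before a closing quote. The function name splitRespectingEscapes is hypothetical, and backslash escaping is an assumption about the input format.

```javascript
// Hypothetical helper: same character loop, but aware of \" inside quotes.
function splitRespectingEscapes(input) {
  const delims = ['>', '#', '.', '[', '{'];
  const parts = [];
  let part = '';
  let inQuotes = false;

  for (let i = 0; i < input.length; i++) {
    const ch = input[i];

    // Inside quotes, a backslash escapes the next character:
    // copy both verbatim so an escaped quote never toggles inQuotes.
    if (inQuotes && ch === '\\' && i + 1 < input.length) {
      part += ch + input[i + 1];
      i++;
      continue;
    }

    if (ch === '"') inQuotes = !inQuotes;

    if (!inQuotes && delims.indexOf(ch) > -1 && part !== '') {
      parts.push(part); // A delimiter outside quotes closes the current part
      part = ch;
    } else {
      part += ch;
    }
  }

  if (part !== '') parts.push(part); // Flush whatever is left
  return parts;
}

const tricky = '[title="He said, \\"hi.there\\""].x';
console.log(splitRespectingEscapes(tricky).length); // 2
```

For the sample string used in this answer it returns the same seven parts as the loop above.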
In terms of readability I think an approach like this is preferred over Regex, though.