admin管理员组文章数量:1352005
Update: Currently, a working solution has been achieved with multiple regular expressions used simultaneously
I'm looking for a regular expression that covers the range of acceptable syntax for ISO 8601 dates. Most libraries and regex's only cover a subset or a "simplified" version of 8601, such as RFC 3339, but the full 8601 includes ways to express time durations and intervals. The wikipedia entry for ISO 8601 provides a good overview, but in short these should all be valid dates:
var testDates = {
'year' : "2013",
'date' : "2013-01-05",
'datetime' : "2013-01-05T04:13:00+00:00",
'Duration only' : "P1Y2M10DT2H30M",
'Week Duration' : "P1W",
'Range with start and end' : "2007-03-01T13:00:00Z/2008-05-11T15:30:00Z",
'Range of Date/Duration' : "2007-03-01T13:00:00Z/P1Y2M10DT2H30M",
'Range of Date/Duration 1 month' : "2012-10/P1M",
'Range of Date/Duration 1 week' : "2012-10/P1W",
'Range of Duration/Date' : "P1Y2M10DT2H30M/2007-03-01T13:00:00Z",
'Repeating interval 5 times' : "R5/2007-03-01T13:00:00Z/P1Y2M10DT2H30M",
'Repeating interval weekly indefinitely' : "R/2012-10/P1W",
'Repeating interval monthly 5 times' : "R5/2012-10/P1M"
}
I'm trying to build a regular expression that will cover all these possibilities. I've pieced a few ponents together, but am not skilled enough with regular expressions to make them all work as one expression and cover all possible cases. Just doing a simple "OR" between these doesn't seem to work as expected, but perhaps I'm not doing it quite right.
The expressions I've tried include these:
var regex = {
'Date' : /^([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?$/,
'Duration' : /^P(?=\w*\d)(?:\d+Y|Y)?(?:\d+M|M)?(?:\d+W|W)?(?:\d+D|D)?(?:T(?:\d+H|H)?(?:\d+M|M)?(?:\d+(?:\.\d{1,2})?S|S)?)?$/,
'Range of Date/Date' : /^([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?(\/)([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?$/,
'Range of Date/Duration' : /^([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?(\/)P(?=\w*\d)(?:\d+Y|Y)?(?:\d+M|M)?(?:\d+W|W)?(?:\d+D|D)?(?:T(?:\d+H|H)?(?:\d+M|M)?(?:\d+(?:\.\d{1,2})?S|S)?)?$/,
'Range of Duration/Date' : /^P(?=\w*\d)(?:\d+Y|Y)?(?:\d+M|M)?(?:\d+W|W)?(?:\d+D|D)?(?:T(?:\d+H|H)?(?:\d+M|M)?(?:\d+(?:\.\d{1,2})?S|S)?)?\/([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?$/,
'Repeating interval' : /^R\d*\/([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?\/P(?=\w*\d)(?:\d+Y|Y)?(?:\d+M|M)?(?:\d+W|W)?(?:\d+D|D)?(?:T(?:\d+H|H)?(?:\d+M|M)?(?:\d+(?:\.\d{1,2})?S|S)?)?$/
}
I've put together an example script that tries these expressions on each of the dates listed above, but if anyone could provide guidance on bining all these cases into one expression, it would be much appreciated.
The use case that's driving this is to validate an ISO 8601 Date for a JSON Schema document. JSON Schema does provide some flexibility in how something is validated in that you can provide multiple rules or regular expressions to test. In this case, I could solve my problem by using multiple regular expressions rather than one bined expression.
Update: the problem has been addressed so far by using this approach (multiple separate expressions used simultaneously)
Here's the script:
- As a gist:
- As a webpage: /
Update: Currently, a working solution has been achieved with multiple regular expressions used simultaneously
I'm looking for a regular expression that covers the range of acceptable syntax for ISO 8601 dates. Most libraries and regex's only cover a subset or a "simplified" version of 8601, such as RFC 3339, but the full 8601 includes ways to express time durations and intervals. The wikipedia entry for ISO 8601 provides a good overview, but in short these should all be valid dates:
var testDates = {
'year' : "2013",
'date' : "2013-01-05",
'datetime' : "2013-01-05T04:13:00+00:00",
'Duration only' : "P1Y2M10DT2H30M",
'Week Duration' : "P1W",
'Range with start and end' : "2007-03-01T13:00:00Z/2008-05-11T15:30:00Z",
'Range of Date/Duration' : "2007-03-01T13:00:00Z/P1Y2M10DT2H30M",
'Range of Date/Duration 1 month' : "2012-10/P1M",
'Range of Date/Duration 1 week' : "2012-10/P1W",
'Range of Duration/Date' : "P1Y2M10DT2H30M/2007-03-01T13:00:00Z",
'Repeating interval 5 times' : "R5/2007-03-01T13:00:00Z/P1Y2M10DT2H30M",
'Repeating interval weekly indefinitely' : "R/2012-10/P1W",
'Repeating interval monthly 5 times' : "R5/2012-10/P1M"
}
I'm trying to build a regular expression that will cover all these possibilities. I've pieced a few ponents together, but am not skilled enough with regular expressions to make them all work as one expression and cover all possible cases. Just doing a simple "OR" between these doesn't seem to work as expected, but perhaps I'm not doing it quite right.
The expressions I've tried include these:
var regex = {
'Date' : /^([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?$/,
'Duration' : /^P(?=\w*\d)(?:\d+Y|Y)?(?:\d+M|M)?(?:\d+W|W)?(?:\d+D|D)?(?:T(?:\d+H|H)?(?:\d+M|M)?(?:\d+(?:\.\d{1,2})?S|S)?)?$/,
'Range of Date/Date' : /^([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?(\/)([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?$/,
'Range of Date/Duration' : /^([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?(\/)P(?=\w*\d)(?:\d+Y|Y)?(?:\d+M|M)?(?:\d+W|W)?(?:\d+D|D)?(?:T(?:\d+H|H)?(?:\d+M|M)?(?:\d+(?:\.\d{1,2})?S|S)?)?$/,
'Range of Duration/Date' : /^P(?=\w*\d)(?:\d+Y|Y)?(?:\d+M|M)?(?:\d+W|W)?(?:\d+D|D)?(?:T(?:\d+H|H)?(?:\d+M|M)?(?:\d+(?:\.\d{1,2})?S|S)?)?\/([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?$/,
'Repeating interval' : /^R\d*\/([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?\/P(?=\w*\d)(?:\d+Y|Y)?(?:\d+M|M)?(?:\d+W|W)?(?:\d+D|D)?(?:T(?:\d+H|H)?(?:\d+M|M)?(?:\d+(?:\.\d{1,2})?S|S)?)?$/
}
I've put together an example script that tries these expressions on each of the dates listed above, but if anyone could provide guidance on bining all these cases into one expression, it would be much appreciated.
The use case that's driving this is to validate an ISO 8601 Date for a JSON Schema document. JSON Schema does provide some flexibility in how something is validated in that you can provide multiple rules or regular expressions to test. In this case, I could solve my problem by using multiple regular expressions rather than one bined expression.
Update: the problem has been addressed so far by using this approach (multiple separate expressions used simultaneously)
Here's the script:
- As a gist: https://gist.github./philipashlock/8830168
- As a webpage: http://bl.ocks/philipashlock/raw/8830168/
- Interesting question and I'll be curious to see your final answer. However, I would strongly suggest writing these (non-trivial) regexes in free spacing mode with indentation and ments so that it is readable by mere mortals (even though JavaScript does not support the x-modifier). With JS's no-x-mode-limitation, I like to include the verbose regex inside a multi-line ment immediately preceeding the actual RegExp literal. – ridgerunner Commented Feb 11, 2014 at 7:23
- I've updated the description to note that all of these expressions are just different binations of two expressions (one for date and one for duration) and there's only one bination that does not currently work. I've also updated the gist with references to the source of those two expressions where there's some additional explanation of how they were constructed. – Philip Ashlock Commented Feb 14, 2014 at 22:43
3 Answers
Reset to default 5Here's the full set of expressions I've found to achieve the variety of ISO 8601 syntax. The gist has been updated with these and additional tests for repeating intervals.
var regex = {
'Date' : /^([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?$/,
'Duration' : /^P(?=\w*\d)(?:\d+Y|Y)?(?:\d+M|M)?(?:\d+W|W)?(?:\d+D|D)?(?:T(?:\d+H|H)?(?:\d+M|M)?(?:\d+(?:\.\d{1,2})?S|S)?)?$/,
'Range of Date/Date' : /^([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?(\/)([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?$/,
'Range of Date/Duration' : /^([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?(\/)P(?=\w*\d)(?:\d+Y|Y)?(?:\d+M|M)?(?:\d+W|W)?(?:\d+D|D)?(?:T(?:\d+H|H)?(?:\d+M|M)?(?:\d+(?:\.\d{1,2})?S|S)?)?$/,
'Range of Duration/Date' : /^P(?=\w*\d)(?:\d+Y|Y)?(?:\d+M|M)?(?:\d+W|W)?(?:\d+D|D)?(?:T(?:\d+H|H)?(?:\d+M|M)?(?:\d+(?:\.\d{1,2})?S|S)?)?\/([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?$/,
'Repeating interval' : /^R\d*\/([\+-]?\d{4}(?!\d{2}\b))((-?)((0[1-9]|1[0-2])(\3([12]\d|0[1-9]|3[01]))?|W([0-4]\d|5[0-2])(-?[1-7])?|(00[1-9]|0[1-9]\d|[12]\d{2}|3([0-5]\d|6[1-6])))([T\s]((([01]\d|2[0-3])((:?)[0-5]\d)?|24\:?00)([\.,]\d+(?!:))?)?(\17[0-5]\d([\.,]\d+)?)?([zZ]|([\+-])([01]\d|2[0-3]):?([0-5]\d)?)?)?)?\/P(?=\w*\d)(?:\d+Y|Y)?(?:\d+M|M)?(?:\d+W|W)?(?:\d+D|D)?(?:T(?:\d+H|H)?(?:\d+M|M)?(?:\d+(?:\.\d{1,2})?S|S)?)?$/
}
Is a regex the best/most appropriate solution for this?
It appears that you're really trying to parse a string according to the spec. Just look at how plex some of those regexes are that you've tried.
My suggestion would be either to make simpler regexes that match each pattern mixed in with some if/then code or to use/create a full on parser.
Remember that regexes were designed for pattern matching, not parsing. They're great for pulling out a quick value from a string but not for more plex things.
Allow me to question the full approach. Dates, instants, and ranges are different concepts, accepting them interchangeably (what I understand from wanting a single regular expression for all of them) will lead to user, programmer, or (even worse) both error.
If the allowed formats are too plex, somebody will make mistakes. Can't your application work with a subset?
本文标签: javascriptRegular expression for full ISO 8601 date syntaxStack Overflow
版权声明:本文标题:javascript - Regular expression for full ISO 8601 date syntax - Stack Overflow 内容由网友自发贡献,该文观点仅代表作者本人, 转载请联系作者并注明出处:http://www.betaflare.com/web/1743906910a2559690.html, 本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌抄袭侵权/违法违规的内容,一经查实,本站将立刻删除。
发表评论