parsing - Granularity of tokens for lexer - Stack Overflow


I want to build a little lexer and parser by myself. I want the lexer to produce a vector of tokens that I feed into the parser later. Now I am wondering what belongs in which stage.

Let's look at this input:

xy = 1.23

My token stream could be one of the following - or a mixture of both:

1. letter letter whitespace eqsign whitespace digit dot digit digit
2. identifier eqsign decimal

To further process the input, I need (2) of course. But to what extent should the lexer stage do that job? I could also imagine two consecutive lexer stages, in which Lexer1 produces (1) from a String and Lexer2 produces (2) from a List<Lexer1Token>.

Similarly, for <b>test</b> in HTML, the tokens might be

1. lt string gt string lt slash string gt
2. opentag[type=b] string closingtag[type=b]
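
To make the two-stage idea concrete, here is a rough sketch for the `xy = 1.23` input. The question does not name a language, so Python is an assumption, and `lexer1`, `lexer2`, and the `(kind, text)` tuple shape are hypothetical names chosen for illustration only. Stage 1 classifies single characters; stage 2 groups them into the coarser tokens.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (kind, text); an assumed, minimal token shape


def lexer1(source: str) -> List[Token]:
    """Stage 1: classify every character on its own (the fine-grained stream)."""
    tokens = []
    for ch in source:
        if ch.isalpha():
            kind = "letter"
        elif ch.isdigit():
            kind = "digit"
        elif ch.isspace():
            kind = "whitespace"
        elif ch == "=":
            kind = "eqsign"
        elif ch == ".":
            kind = "dot"
        else:
            raise ValueError(f"unexpected character {ch!r}")
        tokens.append((kind, ch))
    return tokens


def _run(chars: List[Token], start: int, kind: str) -> int:
    """Return the index just past a run of `kind` tokens beginning at `start`."""
    end = start
    while end < len(chars) and chars[end][0] == kind:
        end += 1
    return end


def _text(chars: List[Token], start: int, end: int) -> str:
    """Join the text of the stage-1 tokens in chars[start:end]."""
    return "".join(text for _, text in chars[start:end])


def lexer2(chars: List[Token]) -> List[Token]:
    """Stage 2: group stage-1 tokens into the coarse stream the parser wants."""
    out = []
    i = 0
    while i < len(chars):
        kind, text = chars[i]
        if kind == "whitespace":
            i += 1  # whitespace is dropped, not passed on to the parser
        elif kind == "letter":
            j = _run(chars, i, "letter")
            out.append(("identifier", _text(chars, i, j)))
            i = j
        elif kind == "digit":
            j = _run(chars, i, "digit")
            # Optional fractional part: a dot followed by at least one digit.
            if j + 1 < len(chars) and chars[j][0] == "dot" and chars[j + 1][0] == "digit":
                j = _run(chars, j + 1, "digit")
            # Simplification: a plain integer is labelled "decimal" here too.
            out.append(("decimal", _text(chars, i, j)))
            i = j
        elif kind == "eqsign":
            out.append(("eqsign", text))
            i += 1
        else:
            raise ValueError(f"cannot start a token with {kind!r}")
    return out


if __name__ == "__main__":
    print(lexer2(lexer1("xy = 1.23")))
```

Note that stage 2 ends up doing exactly the grouping work a single-pass lexer would do, only over a list of one-character tokens instead of over the string.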

asked Nov 21, 2024 at 18:16 by MrSnrub; edited Nov 21, 2024 at 18:23 by MJane


1 Answer


Obviously it depends on your language (e.g. your language might need special handling of `.`), but for most cases you just need version 2, [identifier, equal, decimal] (I would call the middle token assign).

Let the lexer do as much as possible without getting into the domain of the parser (e.g. deciding whether the order of the tokens is valid).
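
As a minimal sketch of that division of labour, assuming Python and a toy grammar with only identifiers, `=`, and decimals (the names `TOKEN_SPEC`, `Token`, and `tokenize` are made up for this example), a single-pass lexer built on the standard `re` module could look like this:

```python
import re
from typing import List, NamedTuple


class Token(NamedTuple):
    kind: str
    text: str


# Assumed token set for the toy input; a real language would need more rules.
TOKEN_SPEC = [
    ("decimal", r"\d+\.\d+"),
    ("identifier", r"[A-Za-z_]\w*"),
    ("assign", r"="),
    ("skip", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source: str) -> List[Token]:
    """Turn the source string directly into coarse tokens, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        # lastgroup is the name of the alternative that matched.
        if match.lastgroup != "skip":
            tokens.append(Token(match.lastgroup, match.group()))
        pos = match.end()
    return tokens


if __name__ == "__main__":
    print(tokenize("xy = 1.23"))
    # The lexer has no opinion about token order; this lexes without error.
    print(tokenize("= = xy"))
```

The lexer groups characters into tokens and rejects characters it cannot classify, but it accepts `= = xy` without complaint. Deciding that such a sequence is not a valid statement is left to the parser.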
