I have a large C code base, with >100 binaries, >3000 files and >30 libraries. A lot of dead code has accumulated, and I'm looking for ways to identify and remove it. The code is simple: no complex macros and very little automatically generated code (lex/bison/...).
To identify "static" dead code (and variables), gcc does a good job: the -Wunused-* options identify all unused static variables, static functions, and so on. My challenge is with non-static global functions and variables (and the code base has a lot of them!).
I've gotten a lot of mileage out of running 'nm' across all the object files: I create a list of all defined global symbols (types 'T', 'D' and 'B' for code, data and uninitialized data), then remove every symbol that also appears as 'U' (undefined). That process identifies all globals that are not referenced from another object file. At this point, I have to manually make each symbol static, compile with gcc -Werror -Wunused, and see if it raises any error.
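# Omitting some details for brevity.
# Sort each list on the field that join compares (the symbol name).
nm --undefined-only lib1.a lib2.a ... obj1 obj2.o obj3.o | sort -b -k2,2 > refs.txt
nm --extern-only --defined-only lib1.a lib2.a ... obj1 obj2.o obj3.o | sort -b -k3,3 > defs.txt
# Print the definitions (field 3 of defs.txt) with no matching reference (field 2 of refs.txt).
join -1 2 -2 3 -v 2 refs.txt defs.txt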
My question: is it possible to use "nm" (or another object analysis tool like objdump) to identify which global symbols in an object file are also used inside the same object? This would speed up dead code elimination by separating global functions that are truly dead from global functions that are actually used (but could become static).
Alternatively, is there any other existing tool that will do the job?
Asked by dash-o; edited by mkrieger1.
1 Answer
I suggest using GNU ld's dead symbol removal functionality for this.
To do so, compile your code with -fdata-sections -ffunction-sections and then link with the -Wl,--gc-sections -Wl,--print-gc-sections flags. The linker will then print information about the functions that have been removed.
Here is example output for a sample program:
/usr/bin/ld: removing unused section '.text.foo' in file '/tmp/ccXZWJ2X.o'
(.text.foo is the section generated for the unused function foo).
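A minimal way to reproduce a message like the one above might look as follows. The file name gc_demo.c and its contents are hypothetical (the original sample program is not shown here), and the sketch assumes gcc with GNU ld on an ELF platform such as Linux.

```shell
# Hypothetical sample: foo() is never called, bar() is called from main().
cat > gc_demo.c <<'EOF'
int foo(void) { return 1; }
int bar(void) { return 2; }
int main(void) { return bar() - 2; }
EOF

# Place every function and data object in its own section.
gcc -ffunction-sections -fdata-sections -c gc_demo.c -o gc_demo.o

# Let the linker discard unreferenced sections and report them.
# The reports go to stderr, so capture them in a log file.
gcc -Wl,--gc-sections -Wl,--print-gc-sections gc_demo.o -o gc_demo 2> gc_demo.log

# Show only the reports that concern our own object file.
grep "gc_demo.o" gc_demo.log
```

The log may also list sections removed from the C runtime start-up files, which is why the final grep filters on the object file name. Compiling into a named object file (rather than a temporary one) also makes the reported file name easier to map back to a source file.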
As a side note, if you use these options there may be no need to manually sanitize your codebase (apart from making it cleaner) because the toolchain will remove dead code automatically.
[...] static and test. Depending on your use case it may or may not save a lot of time. – Weijun Zhou