Binary rewriting

A binary recompiler is a compiler that takes executable binary files as input, analyzes their structure, applies transformations and optimizations, and outputs new optimized executable binaries.[1]

The foundation to the concepts of binary recompilation were laid out by Gary Kildall[2][3][4][5][6][7][8] with the development of the optimizing assembly code translator XLT86 in 1981.[4][9][10][11]

See also

References

  1. ^ Mudge, Trevor; Reinhardt, Steve; Tyson, Gary. "Binary Recompilation and Combined Compiler/Architecture Enhancements Studies". umich.edu. University of Michigan (UM). Archived from the original on 2012-07-23. Retrieved 2012-07-23.
  2. ^ Kildall, Gary Arlen (May 1972). Global expression optimization during compilation (Ph.D. dissertation). Seattle, Washington, USA: University of Washington, Computer Science Group. Thesis No. 20506, Technical Report No. 72-06-02.
  3. ^ Kildall, Gary Arlen (1973-10-01). "A Unified Approach to Global Program Optimization" (PDF). Proceedings of the 1st Annual ACM SIGACT-SIGPLAN Symposium on Principles of Programming Languages (POPL). POPL '73. Boston, Massachusetts, USA: 194–206. doi:10.1145/512927.512945. hdl:10945/42162. S2CID 10219496. Archived (PDF) from the original on 2017-06-29. Retrieved 2006-11-20. ([1])
  4. ^ a b Freiberger, Paul (1981-10-19). "Program translators do it literally - and sometimes in context". InfoWorld - News For Microcomputer Users. Special section: Computer compatibility. Vol. 3, no. 22. Popular Computing, Inc. p. 19. ISSN 0199-6649. Retrieved 2020-01-15. […] "Unless you have a translating scheme that takes account of the peculiar idiosyncrasies of the target microprocessor, there is no way that an automatic translator can work," explains Daniel Davis, a programmer with Digital Research. "You'll end up with direct transliterations." […] In spite of all these limitations, progress has been made recently in the development of translators. Most notably, Digital Research has introduced its eight- to 16-bit assembly code translator. Based on research performed by Digital Research president Gary Kildall, the XLT86 appears to offer advances over previously available software translator technology. Like Sorcim's Trans and Intel's Convert 86, Kildall's package translates assembly-language code from an 8080 microprocessor to an 8086. However, Kildall has applied a global flow analysis technique that takes into account some of the major drawbacks of other translators. The procedure analyzes the register and flag usage in sections of 8080 code in order to eliminate nonessential code. According to Digital Research programmer Davis, the algorithm Kildall uses allows the translator to consider the context as it translates the program. Until now, one of the major problems with any translator program has been the inability of the software to do much more than transliteration. If Digital Research's new translator actually advances the technology to the point where context can be considered, then more software translators may proliferate in the microcomputer marketplace.
  5. ^ Wharton, John Harrison (1994-08-01). "Gary Kildall, industry pioneer, dead at 52: created first microcomputer languages, disk operating systems". Microprocessor Report. 8 (10). MicroDesign Resources Inc. (MDR). Archived from the original on 2016-11-18. Retrieved 2016-11-18.
  6. ^ "SPA Award to Dr. Gary A. Kildall: 1995 SPA Lifetime Achievement Award Winner". Software Publishers Association (SPA). 1995-03-13. Archived from the original on 2019-12-21. Retrieved 2019-12-21 – via www.digitalresearch.biz.
  7. ^ Swaine, Michael (1997-04-01). "Gary Kildall and Collegial Entrepreneurship". Dr. Dobb's Journal. Archived from the original on 2007-01-24. Retrieved 2006-11-20. In March, 1995, the Software Publishers Association posthumously honored Gary for his contributions to the computer industry. They listed some of his accomplishments: […] In the 1980s, through DRI, he introduced a binary recompiler. […]
  8. ^ Huitt, Robert; Eubanks, Gordon; Rolander, Thomas "Tom" Alan; Laws, David; Michel, Howard E.; Halla, Brian; Wharton, John Harrison; Berg, Brian; Su, Weilian; Kildall, Scott; Kampe, Bill (2014-04-25). Laws, David (ed.). "Legacy of Gary Kildall: The CP/M IEEE Milestone Dedication" (PDF) (video transscription). Pacific Grove, California, USA: Computer History Museum. CHM Reference number: X7170.2014. Retrieved 2020-01-19. […] Rolander: I mentioned earlier that Gary liked to approach a problem as an architect. […] And he would draw the most beautiful pictures of his data structures. […] And when he finished that […] and was convinced those data structures were now correct, he would go into just an unbelievable manic coding mode. He would just go for as many as 20 hours a day […] he was just gone during these periods of time. On a couple of those occasions, when he'd get something running the first time, which could be in the middle of night. And all you who have written software have seen that, for example, that the first time it comes up on the screen, you've got to tell somebody. My wife Lori will tell you that I had a couple of those calls in the middle of the night, LOGO was one example, XLT 86 was another, where he got it running the first time, and he had to have somebody see it. So it didn't matter what time it was, he'd call me, I'd have to come over and see it running. […] [2][3] (33 pages)
  9. ^ Barry, Tim (1982-04-05). "XLT-86, a CP/M utility program by Digital Research". InfoWorld - The Newsweekly for Microcomputer Users. InfoWorld Software Review. Vol. 4, no. 13. Popular Computing, Inc. pp. 40–41, 53. ISSN 0199-6649. Retrieved 2020-01-25. […] XLT-86 is an analytical translator program written in PL/I-80. It reads the entire 8080 source program, assembles it to machine code, analyzes the register, memory and flag utilization, and emits an optimized 8086 assembly-language program. […] The program translation proceeds in a five-step process. First, the program is scanned and assembled to produce symbol values and locations. Second, the program structure is analyzed and decomposed into basic blocks. Third, the basic blocks are analyzed to determine program flow and resource usage. Forth, the block structure and register allocation data is gathered into a listing for the user. Fifth, the flow information and source program are used to produce the 8086 source program. […]
  10. ^ Kildall, Gary Arlen (1982-04-19). Swaine, Michael; Freiberger, Paul; Markoff, John Gregory (eds.). "Digital Research founder discusses his view of the business". InfoWorld - The Newsweekly for Microcomputer Users. Special section: CP/M. Vol. 4, no. 15. Popular Computing, Inc. pp. 23–24. ISSN 0199-6649. Retrieved 2020-01-17. […] Kildall: […] A year and a half ago I was probably spending 75% of my time on the business and 25% on programming. XLT-86 was a product I was working on at that time, and it took me nine months to do it. That would have been a three-month project if I had been able to concentrate on it. […]
  11. ^ Kildall, Gary Arlen (June–July 1982). Bunnell, David Hugh; Edlin, Jim (eds.). "Gary Kildall - The Man Who Created CP/M: CP/M's Creator - An Indepth PC-Exclusive Interview with Software Pioneer Gary Kildall". PC Magazine. Operating Systems. Vol. 1, no. 3. Software Communications, Inc. pp. 32–38, 40. Retrieved 2020-01-17. […] PC: What are some of the complexities involved in translating a program from 8080 to 8086 form? Kildall: Straight translations at the source program level you can do pretty much mechanically. For example, an 8080 "Add immediate 5" instruction turns into an "Add AL 5" on the 8086 — very straightforward translation of the op codes themselves. The complexity in mechanical translation comes from situations such as this: The 8080 instruction DAD H takes the HL register and adds DE to it. For the 8086 the equivalent instruction would be something like ADD DX BX, which is fine, no particular problem. You just say the DX register is the same as HL and BX the same as DE. The problem is that the 8086 instruction has a side effect of setting the zero flag, and the 8080 instruction does not. In mechanical translation you end up doing something like saving the flags, restoring the flags, doing some shifts and rotates, and so forth. These add about five or six extra instructions to get the same semantic effect. There are a lot of sequences in 8080 code that produce very strange sequences in 8086 code; they just don't map very well because of flag registers and things of that sort. The way we get software over is a thing called XLT-86. It's been out six months or so. PC: By "better" code do you mean smaller? Kildall: Twenty percent smaller than if you just took every op code and did a straight translation, saving the registers to preserve semantics. PC: How does the size of the translated program compare to the 8080 version? Kildall: If you take an 8080 program, move it over to 86 land and do an XLT-86 translation, you'll find that it is roughly 10 to 20 percent larger. With 16-bit machines it's more difficult to address everything; you get op codes that are a little bit bigger on the average. An interesting phenomenon is that one of the reasons you don't get a tremendous speed increase in the 16-bit world is because you're running more op codes over the data bus. […]

Further reading