Optimizing a new processor architecture


Rob Landley
http://linux.conf.au/schedule/presentation/29/

When the last patents on the SuperH architecture started expiring in 2014, the http://j-core.org project released a new BSD-licensed clean-room VHDL implementation of an SoC compatible with the sh2 instruction set, already capable of booting Linux to a shell prompt on a $50 FPGA board. Linux Weekly News covered this at https://lwn.net/Articles/647636/

Now we'd like to talk about the things we've done to speed up Linux, gcc, musl-libc, and the VHDL itself since we first got Linux booted on the thing ~3 years ago. We've doubled the MHz, added SMP support, implemented futexes, ported everything to device tree, tracked down kernel and toolchain bugs of the "how did this ever work" variety (spoiler: it didn't), and even have a native compiler working on the board.

We'll explain why we selected this architecture instead of i386/sparc/m68k (whose patents have had just as long to expire), how we're scaling the processor design up to 64 bit and down to Arduino country at the same time, what to do when the best way to go isn't clear because of tradeoffs (with a "prefetch vs cache" example), decisions about compatibility (sh2 vs sh3 system call numbers, whether 64-bit mode should have branch delay slots), issues with interrupts, clocks, and futexes that we hit modernizing an older architecture, and so on.

Comments

  1. Glad I'm not the only one with a severe tab problem.
  2. Awesome!
  3. So that's why Pi likes to corrupt the SD cards! Is it possible to fix that somehow?
  4. The presentation text is at http://landley.net/talks/lca-2017.txt
    It would be interesting to have j-core and perhaps Tensilica (ESP32) results to compare with arXiv:1607.02318,
    "The Renewed Case for the Reduced Instruction Set Computer: Avoiding ISA Bloat with Macro-Op Fusion for RISC-V",
    which compares the dynamic instruction counts and dynamic instruction bytes fetched for the popular proprietary ARMv7, ARMv8, IA-32, and x86-64 Instruction Set Architectures (ISAs) against the free and open RISC-V RV64G and RV64GC ISAs when running the SPEC CINT2006 benchmark suite.
  5. Great topic, and really interesting, but let down by constant backtracking ("here's a tab that I'm not going to show you. And another one"). I've seen some really great no-slides talks before, but all the sidebars that went nowhere really let this talk down. There's about 20 minutes of material here, presented in 50.

    I would suggest Rob focus on a few key topics:

    Baseline performance of the j-core original port
    How they improved the maturity and stability of the port
    How they improved the core (with microbenchmarks or metrics)
    How they improved the toolchain (with microbenchmarks or metrics)
    Current benchmarks and demo of FPGA hardware
    How folks can contribute via emulation and in hardware

    Get rid of all the tabs; they distracted from both the material and the speaker. None of the tabs ended up being discussed for any reason, so they were more than terrible. By not backtracking, and by maintaining the forward momentum of the core story, this talk could have been far more compelling.


Additional Information:

Duration: 48m 6s
