Q6) Is there a concept of a port?
- Kind of. We can use aggregates as function arguments.

Q5) Is it possible to make multi-cycle operations?
- Like multi-cycle multiplies and divides? I do have a few designs for those built using CHDL (you'd rarely want to use the CHDL built-in divide), but they're not in the standard library.

Q7) How do I read the printed message to understand the critical path?
- What "critpath" means hasn't been covered yet. Maybe do a tutorial on performance optimization.

Q8) How to set up a clock cycle?
- As in a clock domain? (Not relevant to this discussion.)
- As in a cycle time? (Not currently relevant; we use clock cycles as the basic unit of time in the simulator. When we include it with SST, we specify the cycle time in the component configuration.)

Q1) How to do memory read and write operations?
- On SRAM? Covered in today's slides; use Syncmem.
- On external memory? That depends on the interface of the particular external memory.

Q2) How to configure memory (address bits and data size)?
- Based on the size of the inputs to the Syncmem (template parameters).

Q3) Is there any way to display memory values?
- Not yet. Working on it as part of a general memory API refactoring.

Q9) SRAM and DRAM?
- All internal CHDL RAM is SRAM. All RAM (that we use) is synchronous. DRAM is always off-chip, accessed through I/O.

(1) When to use nodes vs. bit vectors, and how to convert between them?
- Nodes are just single bits. Bit vectors are just vectors of nodes.
- To get a node from a bvec, select one using either a Mux (dynamic) or [] (static).
- To get a bvec<1> from a node n, use bvec<1> x(n). Passing a node to a bvec's constructor initializes every element in that bvec to the given node.
- To do all-reduce Ands, Ors, and Xors of a bvec x, use AndN(x), OrN(x), or XorN(x) respectively.

(2) How to add a prefix or suffix to TAP variables?
- Use the lower-case tap function instead of the macro: tap("prefix_x", x);

(3) The usage of hierarchy
- Hierarchy is useful in debugging (especially performance debugging).
- Just call HIERARCHY_ENTER() at the start of every function and HIERARCHY_EXIT() at the end (just prior to return). All of the nodes will be inserted into an appropriate place in the call graph. This lets you see just where each step on your critical path was defined.

(4) How to increase the data width from the demo code? I faced a memory initialization problem.
- You should just be able to increase the value in "WORD_SZ".
- Oh. But that makes the LLRom for the instruction memory prohibitively large. Let's truncate the address down to IMEM_SZ bits:

    // Instruction memory; we use a Harvard architecture.
    inst_t InstMem(const word_t &a) {
      return LLRom(Zext<IMEM_SZ>(a), "demo.hex");
    }

- Now this can scale to arbitrary word sizes without crashing. This was tried with 27-bit words.

(5) How to initialize register values? Is there a concept of class and instantiation?
- Register initialization is pretty shaky currently. You can pass numbers to bvec registers and pass booleans to ordinary registers, but that's about it.

Q4) Pipeline latch designs
(6) Pipeline latches: I'd like to use the same structure/functions, but the latch sizes are different for different pipeline stages. How can we do that?
- Using aggregate types. Then your pipeline register becomes either the stock Wreg (next_stage = Wreg(!stall, this_stage)) or a user-defined template PipelineReg function with support for instruction cancelling, etc.:

    template <typename T>
    T PipelineReg(node clear, node stall, const T &in) {
      HIERARCHY_ENTER();
      // Assumption: sz<T>::value is the flattened bit width of T.
      bvec<sz<T>::value> d(Flatten(in)), q, zero(Lit<sz<T>::value>(0));
      // Write when not stalled or when clearing; a clear writes zero.
      q = Wreg(!stall || clear, Mux(clear, d, zero));
      HIERARCHY_EXIT();
      return q;
    }

  Note that these are full edge-triggered registers, not level-triggered latches.
  Latch-based pipelining is a wonderful idea, but CHDL rests on two assumptions: that splitting pipeline registers into pairs of pipeline latches is a simple retiming problem, and that designers prefer to think in terms of clock cycles during the RTL phase.