What to submit
- demonstration of functionality during office hours
- hard copies of
  - lab writeup
  - sinegen_proc.s55
- email Yusheng (subject line: lab2abc submit) soft copies of
  - sinegen_proc.s55
  - userlinker.cmd
Goals
- Correct functionality
- 176-180 cycles for sinegen_proc() will get you full credit for the "execution cycles" component of your grade. Extra credit is possible!
- Extra credit for improving to 1 cycle per sample. Hint: you are allowed to manipulate the size of the sine1 buffer.
Tips
To set a loop iteration counter like CSR or BRC0, you should set a temporary register like T0 to be the correct count and then set CSR or BRC0 to the temp. You are not allowed to perform arithmetic on the loop iteration counter register itself.
For sine wave generation, coefSect is not properly mapped in the linker command file (userlinker.cmd). You must add a mapping for coefSect in userlinker.cmd. There are two choices: SARAM and DARAM (not SARAM1 and DARAM1). Think about which choice is better.
Select which sine table you will use, and copy that table to sine1 in sinegen_proc_init. Then your sinegen_proc function can just use the data in sine1.
In practical cases, the tables would be in external memory, which is very slow, so accessing them directly is slow as well. It is better to copy the table to internal memory before using it (just the one we want to use, because internal memory is limited in size). Our case simply emulates this situation.
sine1 needs to be a circularly-addressed buffer. The sinegen_proc function should copy values from sine1 to the dst buffer, circling through sine1 until the dst buffer is full.
You should generate the sine wave on both channels: left and right.
Update the value of inx at the end of sinegen_proc so that the next time this function is called, it will know to resume where it left off.
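As a reference for checking the assembly output, the behavior described in the last few tips can be modeled in C. This is only a sketch, not the required implementation: SINE_LEN, the name sinegen_proc_model, the parameter names, and the interleaved left/right layout of dst are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define SINE_LEN 8  /* hypothetical table length; use your table's real size */

/* Fill dst with `frames` stereo frames, reading sine1 circularly from *inx.
 * Assumed layout: dst[2*n] = left, dst[2*n + 1] = right. */
void sinegen_proc_model(const short *sine1, short *dst, size_t frames, size_t *inx)
{
    size_t n;
    size_t i = *inx;                 /* resume where the previous call stopped */

    assert(i < SINE_LEN);            /* the saved index must lie inside the table */

    for (n = 0; n < frames; n++) {
        dst[2 * n]     = sine1[i];   /* left channel */
        dst[2 * n + 1] = sine1[i];   /* right channel gets the same sample */
        i++;
        if (i == SINE_LEN) {         /* wrap, as circular addressing would */
            i = 0;
        }
    }
    *inx = i;                        /* saved for the next call */
}
```

Calling the model twice in a row with the same inx variable should produce one continuous sine wave across the two dst buffers, which is the property the assembly version has to preserve.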
When you set a circular buffer pointer to be an index, the pointer takes on the value of the index, not the value of the origin + index. This is normal.
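A simplified C model of why the pointer holds only the index. It assumes that the hardware forms the accessed address as buffer start address plus pointer contents, and wraps the pointer against the buffer size. The struct and function names are hypothetical, and the model ignores the upper (extended) address bits.

```c
#include <assert.h>

/* Simplified model of one circularly addressed pointer. */
struct circ_ptr {
    unsigned bsa;    /* buffer start address (BSAxx) */
    unsigned bk;     /* buffer size in words (BKxx) */
    unsigned index;  /* pointer register contents: an index, 0..bk-1 */
};

/* Address actually accessed: start address plus the index in the pointer. */
unsigned circ_address(const struct circ_ptr *p)
{
    assert(p->index < p->bk);
    return p->bsa + p->index;
}

/* Post-increment by one word; only the index changes, and it wraps at bk. */
void circ_advance(struct circ_ptr *p)
{
    p->index++;
    if (p->index >= p->bk) {
        p->index -= p->bk;
    }
}
```

In this model the origin never appears in the pointer itself, so seeing a small value such as 3 in the pointer register while debugging is expected.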
To access C variables from your .s55 file, you need to add a .global declaration in the .h55 file. This allocates a place in DSP memory that points to the address of the C variable. For example,
.global _dst
goes in the .h55 file. _dst (in assembly) is equal to the address of dst (in C). To actually access the start of the xmtBuffer from your .s55 file, you need to put parentheses around _dst and then dereference it. In general, you need to put parens around a variable before you dereference it. So to complete the example,
*(_dst)
is the last 16 bits of the address of the beginning of the xmtBuffer. You will need to load the entire address into an XAR register...
The address of the destination buffer is stored at _dst, so in the assembly function you should read the value stored at _dst and load it into an XAR register. Since the value stored at _dst is a pointer, it is 23 bits of data, so the C55x reserves 2 words (32 bits) to store it. Look up the MOV instructions in the mnemonic instruction reference manual; there is a MOV instruction for loading a value into an XAR register.
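The two levels involved can be pictured in C. This is only an analogy for the assembly symbols: the buffer size is taken from the writeup tips, and dst_variable_address and dst_contents are hypothetical helpers that do not exist in the lab code.

```c
#include <assert.h>

short xmtBuffer[192];        /* transmit buffer */
short *dst = xmtBuffer;      /* C pointer variable holding the buffer's address */

/* What the assembly symbol _dst stands for: where the variable dst lives. */
short **dst_variable_address(void)
{
    return &dst;
}

/* What reading the contents stored at _dst yields: the start of xmtBuffer. */
short *dst_contents(void)
{
    short **p = dst_variable_address();
    assert(p != 0);
    return *p;
}
```

The assembly function needs the second value, the contents, in an XAR register; loading the first one would make it write over the pointer variable instead of the buffer.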
After BSET-ing a bit, you should BCLR it when you're done with it. This is because the bit may be used by something else that is counting on it to keep its original value.
The idea behind the answer to writeup question #2 should be used in your code. Use AMOV to specify the circular buffer start address properly.
Registers for circular addressing (BSAxx and BKxx) can be initialized just once in _sinegen_proc_init, or they can be set every time _sinegen_proc is called. Initializing them only in _sinegen_proc_init would probably work, since DSP/BIOS programs wouldn't try to modify those registers. However, page 6-11 of the Optimizing C/C++ Compiler manual says that BKxx and BSAxx are used by compiled code, which means we can't guarantee that the C code leaves them untouched. Thus, you should initialize those registers every time _sinegen_proc is called.
You will check the cycles for two functions: sinegen_proc_init and sinegen_proc. Calling a subroutine and returning from it each involve a program counter (PC) discontinuity. Aligning the start address to a 4-byte boundary reduces the cycle overhead of instruction buffer queue (IBQ) startup. So, before the labels sinegen_proc_init: and sinegen_proc:, add the " .align 4" directive.
You will be graded on code structure and development, so avoid using magic constant values in your code, if possible.
To profile sinegen, just check the cycles in your C program. That is, reset the clock before calling the asm function, step over it with <F10> in C, then read the cycles. Don't use NOPs. (Even though this gives only an approximate cycle count, the error is very small compared to the total.)
The way we did it in LAB1 is the most precise way to measure cycles, but it was just for educational purposes. Once a function is complex enough to consume more than a hundred cycles, the one- or two-cycle difference from the precise answer obtained with NOPs can be ignored.
If you check performance this way, the cycle count is sometimes more than 1500 when you expect around 200. That's because it's a real-time system.
During the execution of your assembly function, interrupts can occur; program execution then goes to the ISR and does some processing inside DSP/BIOS. Those 1500 cycles include all that work.
To measure the cycles correctly, use these two functions:
_disable_interrupts();
_enable_interrupts();
These are predefined functions in the C55x C compiler that let you easily disable and enable interrupts from C. Disable interrupts just before your function, and enable them just after it. Put breakpoints at sinegen_proc(); and _enable_interrupts(). Measure the cycles between these breakpoints. Your cycle measurement should then be reasonable.
Disabling interrupts is just for profiling. Remove these function calls after profiling.
Subroutine calling overhead is roughly 15 cycles; 6 cycles for 1 or 2 clocks for IBQ start overhead both for call and ret. So the measured cycle count is expected to differ from the hand calculation by about 15 cycles.
Writeup tips
code size: size(total) - size(total - function) = size(function), where size(total - function) is obtained by building with the function's body emptied (leaving only the RET instruction).
data size: report the size of the data that each function uses. So for sinegen_proc_init(), it would be size(coefSect) + size(coefDat)/2. For sinegen_proc(), it would be size(xmtBuffer)+size(coefSect)+size(inx), where size(xmtBuffer) = 192.
hand calculations: include the RET
EVM profile: use the disable/enable interrupts method described above.