BATCH 4 | Project Stage 3, Part 2 - Test Cases and Results


Hi everyone! Welcome to Part 2 of my Project Stage 3. In this post, I’ll walk you through the results of my test cases and show that my clone-prune pass works across multiple functions.

Let's Dive In!

STEP 1: Let’s start by navigating to the SPO600 clone test files directory with the command cd spo600/examples/test-clone/. After running make, this was the output in my terminal. It confirms that cloned functions are detected. I also made sure the pass reports which resolver functions it skips, so that only non-resolver functions are compared.



STEP 2: With those updates in place, I wanted to confirm on the test case file that the diagnostic messages saying whether or not to PRUNE a function work as intended. Below are screenshots of the PRUNE and NOPRUNE messages for the functions detected as clones. To find these messages in the dump file, I used the command less clone-test-aarch64-prune-clone-test-core.c.270t.count.
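If paging through the dump with less gets tedious, the same check can be scripted. Here is a minimal Python sketch that collects the PRUNE and NOPRUNE decisions from the text of a dump file. It assumes each diagnostic sits on its own line in the form "PRUNE: name" or "NOPRUNE: name"; the helper name collect_decisions and the sample text are hypothetical, so adjust the pattern to whatever your pass actually prints.

```python
import re

# Assumed message shape: "PRUNE: <name>" or "NOPRUNE: <name>" at the start of a line.
DECISION_RE = re.compile(r"^\s*(PRUNE|NOPRUNE):\s*(\S+)")


def collect_decisions(dump_text):
    """Return the function names per verdict, in order of appearance."""
    decisions = {"PRUNE": [], "NOPRUNE": []}
    for line in dump_text.splitlines():
        match = DECISION_RE.match(line)
        if match:
            verdict, name = match.groups()
            decisions[verdict].append(name)
    return decisions


# Hypothetical excerpt, only to show the expected shape of the input.
sample = """\
;; Function prune_func4 (hypothetical dump excerpt)
PRUNE: prune_func4
NOPRUNE: noprune_func1
"""

result = collect_decisions(sample)
print(result["PRUNE"])    # ['prune_func4']
print(result["NOPRUNE"])  # ['noprune_func1']
```

To run it on the real dump, read clone-test-aarch64-prune-clone-test-core.c.270t.count into a string and pass that string to collect_decisions.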










STEP 3: Now that PRUNE and NOPRUNE are displayed correctly for the functions detected as clones, let's look at the object dump and check that the non-resolver versions of each PRUNE function really are substantially the same, while the versions of each NOPRUNE function have different implementations.

Below is a screenshot of the object dump, produced with objdump -d piped through grep '>:' so that only the function labels are shown. Notice the added prune_func and noprune_func functions, which are named to indicate whether or not each one should be pruned. These are the non-resolver functions that are meant to be compared.
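To make the label list easier to scan, the lines that grep '>:' prints can be grouped by function. The sketch below is one way to do that in Python, assuming each label has the shape "address <name.variant>:" and that resolvers carry a ".resolver" suffix (an assumption about the naming, not something confirmed in this post). The helper name group_variants is hypothetical, and the resolver line in the sample uses a made-up address.

```python
import re
from collections import defaultdict

# Matches objdump label lines such as "0000000000400e44 <prune_func4.default>:"
HEADER_RE = re.compile(r"^([0-9a-f]+) <([^>]+)>:$")


def group_variants(header_lines):
    """Map each base function name to {variant: address}, skipping resolvers."""
    groups = defaultdict(dict)
    for line in header_lines:
        match = HEADER_RE.match(line.strip())
        if not match:
            continue
        address, symbol = match.groups()
        base, dot, variant = symbol.partition(".")
        # Plain symbols (no variant suffix) and resolvers are not compared.
        if not dot or variant == "resolver":
            continue
        groups[base][variant] = int(address, 16)
    return dict(groups)


headers = [
    "0000000000400ca4 <prune_func3.default>:",
    "0000000000400f20 <prune_func3._Mrng>:",
    "0000000000400e44 <prune_func4.default>:",
    "0000000000400e60 <prune_func4._Msve2>:",
    "0000000000400000 <prune_func3.resolver>:",  # hypothetical line and address
]

for base, variants in group_variants(headers).items():
    print(base, sorted(variants))
```

Each group that ends up with a default entry plus one clone is a pair worth comparing in the next step.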



STEP 4: Let's investigate each function and make sure the PRUNE functions are identical and the NOPRUNE functions differ in implementation. Let's start with three pruned functions, using objdump -d piped through less to page through the disassembly and compare each default version against its _Mrng or _Msve2 clone:

1. PRUNE_FUNC4


0000000000400e44 <prune_func4.default>:
  400e44:       7100c41f        cmp     w0, #0x31
  400e48:       1101a401        add     w1, w0, #0x69
  400e4c:       11017c00        add     w0, w0, #0x5f
  400e50:       1a81c000        csel    w0, w0, w1, gt
  400e54:       d65f03c0        ret
  400e58:       d503201f        nop
  400e5c:       d503201f        nop

0000000000400e60 <prune_func4._Msve2>:
  400e60:       7100c41f        cmp     w0, #0x31
  400e64:       1101a401        add     w1, w0, #0x69
  400e68:       11017c00        add     w0, w0, #0x5f
  400e6c:       1a81c000        csel    w0, w0, w1, gt
  400e70:       d65f03c0        ret
  400e74:       d503201f        nop
  400e78:       d503201f        nop
  400e7c:       d503201f        nop
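Comparing by eye works for a function this short, but it can also be checked mechanically. Below is a rough Python sketch that pulls the instruction-word column out of two objdump -d listings and compares them, ignoring the trailing nop padding (the two versions above end with a different number of nops). The helper names are hypothetical. Comparing raw words is only a fair test when the code has no address-dependent encodings, which holds for this function but not in general.

```python
NOP = "d503201f"  # AArch64 nop, as shown in the listings above


def encodings(listing):
    """Extract the instruction-word column from objdump -d output."""
    words = []
    for line in listing.splitlines():
        parts = line.split()
        # Instruction lines look like "400e44:  7100c41f  cmp  w0, #0x31";
        # label lines and "..." lines are skipped.
        if len(parts) >= 2 and parts[0].endswith(":"):
            words.append(parts[1])
    return words


def strip_padding(words):
    """Drop trailing nops, which are alignment padding after the ret."""
    words = list(words)
    while words and words[-1] == NOP:
        words.pop()
    return words


def same_body(listing_a, listing_b):
    return strip_padding(encodings(listing_a)) == strip_padding(encodings(listing_b))


DEFAULT = """\
0000000000400e44 <prune_func4.default>:
  400e44:       7100c41f        cmp     w0, #0x31
  400e48:       1101a401        add     w1, w0, #0x69
  400e4c:       11017c00        add     w0, w0, #0x5f
  400e50:       1a81c000        csel    w0, w0, w1, gt
  400e54:       d65f03c0        ret
  400e58:       d503201f        nop
  400e5c:       d503201f        nop
"""

SVE2 = """\
0000000000400e60 <prune_func4._Msve2>:
  400e60:       7100c41f        cmp     w0, #0x31
  400e64:       1101a401        add     w1, w0, #0x69
  400e68:       11017c00        add     w0, w0, #0x5f
  400e6c:       1a81c000        csel    w0, w0, w1, gt
  400e70:       d65f03c0        ret
  400e74:       d503201f        nop
  400e78:       d503201f        nop
  400e7c:       d503201f        nop
"""

print(same_body(DEFAULT, SVE2))  # True
```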


2. SCALE_SAMPLES

0000000000400aa0 <scale_samples.default>:
  400aa0:       7100005f        cmp     w2, #0x0
  400aa4:       5400090d        b.le    400bc4 <scale_samples.default+0x124>
  400aa8:       53114064        lsl     w4, w3, #15
  400aac:       5290a3e5        mov     w5, #0x851f                     // #34079
  400ab0:       4b030084        sub     w4, w4, w3
  400ab4:       72aa3d65        movk    w5, #0x51eb, lsl #16
  400ab8:       51000447        sub     w7, w2, #0x1
  400abc:       2a0203e6        mov     w6, w2
  400ac0:       9b257c85        smull   x5, w4, w5
  400ac4:       9365fca5        asr     x5, x5, #37
  400ac8:       4b847ca4        sub     w4, w5, w4, asr #31
  400acc:       531f7884        lsl     w4, w4, #1
  400ad0:       710008ff        cmp     w7, #0x2
  400ad4:       540007a9        b.ls    400bc8 <scale_samples.default+0x128>  // b.plast
  400ad8:       91000803        add     x3, x0, #0x2
  400adc:       cb030023        sub     x3, x1, x3
  400ae0:       f100307f        cmp     x3, #0xc
  400ae4:       54000729        b.ls    400bc8 <scale_samples.default+0x128>  // b.plast
  400ae8:       710018ff        cmp     w7, #0x6
  400aec:       54000829        b.ls    400bf0 <scale_samples.default+0x150>  // b.plast
    ...

0000000000400fc4 <scale_samples._Mrng>:
  400fc4:       7100005f        cmp     w2, #0x0
  400fc8:       5400090d        b.le    4010e8 <scale_samples._Mrng+0x124>
  400fcc:       53114064        lsl     w4, w3, #15
  400fd0:       5290a3e5        mov     w5, #0x851f                     // #34079
  400fd4:       4b030084        sub     w4, w4, w3
  400fd8:       72aa3d65        movk    w5, #0x51eb, lsl #16
  400fdc:       51000447        sub     w7, w2, #0x1
  400fe0:       2a0203e6        mov     w6, w2
  400fe4:       9b257c85        smull   x5, w4, w5
  400fe8:       9365fca5        asr     x5, x5, #37
  400fec:       4b847ca4        sub     w4, w5, w4, asr #31
  400ff0:       531f7884        lsl     w4, w4, #1
  400ff4:       710008ff        cmp     w7, #0x2
  400ff8:       540007a9        b.ls    4010ec <scale_samples._Mrng+0x128>  // b.plast
  400ffc:       91000803        add     x3, x0, #0x2
  401000:       cb030023        sub     x3, x1, x3
  401004:       f100307f        cmp     x3, #0xc
  401008:       54000729        b.ls    4010ec <scale_samples._Mrng+0x128>  // b.plast
  40100c:       710018ff        cmp     w7, #0x6
  401010:       54000889        b.ls    401120 <scale_samples._Mrng+0x15c>  // b.plast
  401014:       53037c45        lsr     w5, w2, #3
  401018:       4e040c9f        dup     v31.4s, w4
  40101c:       d2800003        mov     x3, #0x0                        // #0
  401020:       d37ceca5        lsl     x5, x5, #4
  401024:       3ce36800        ldr     q0, [x0, x3]
  401028:       0f10a41d        sxtl    v29.4s, v0.4h
  40102c:       4f10a400        sxtl2   v0.4s, v0.8h
    ...
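For a longer function like this one, it helps to let a script report where two listings first diverge. The sketch below is a hypothetical helper pair in Python: first_mismatch walks two lists of instruction words, and bcond_offset decodes the branch distance stored in a B.cond word (a signed 19-bit field in bits 23:5, counted in 4-byte instructions). The word lists are the first 20 instruction words of each scale_samples listing above.

```python
def first_mismatch(words_a, words_b):
    """Index of the first differing word, or None if none differ
    within the length of the shorter list."""
    for index, (a, b) in enumerate(zip(words_a, words_b)):
        if a != b:
            return index
    return None


def bcond_offset(word):
    """Byte offset encoded in an AArch64 B.cond instruction word."""
    if word >> 24 != 0x54 or word & 0x10:
        raise ValueError("not a B.cond instruction word")
    imm19 = (word >> 5) & 0x7FFFF
    if imm19 & 0x40000:  # sign-extend the 19-bit field
        imm19 -= 0x80000
    return imm19 * 4


# First 20 instruction words of each listing above.
DEFAULT = (
    "7100005f 5400090d 53114064 5290a3e5 4b030084 72aa3d65 51000447 "
    "2a0203e6 9b257c85 9365fca5 4b847ca4 531f7884 710008ff 540007a9 "
    "91000803 cb030023 f100307f 54000729 710018ff 54000829"
).split()
RNG = (
    "7100005f 5400090d 53114064 5290a3e5 4b030084 72aa3d65 51000447 "
    "2a0203e6 9b257c85 9365fca5 4b847ca4 531f7884 710008ff 540007a9 "
    "91000803 cb030023 f100307f 54000729 710018ff 54000889"
).split()

index = first_mismatch(DEFAULT, RNG)
print(index)                                      # 19
print(hex(bcond_offset(int(DEFAULT[index], 16))))  # 0x104
print(hex(bcond_offset(int(RNG[index], 16))))      # 0x110
```

In the excerpts above, the first 19 words match and the 20th (the b.ls at 400aec and 401010) differs only in how far it branches, which matches the +0x150 and +0x15c targets objdump prints. That suggests the code after this point may be laid out slightly differently in the two versions, while the visible instructions are otherwise the same.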


3. PRUNE_FUNC3

0000000000400ca4 <prune_func3.default>:
  400ca4:       11019000        add     w0, w0, #0x64
  400ca8:       d65f03c0        ret

0000000000400f20 <prune_func3._Mrng>:
  400f20:       11019000        add     w0, w0, #0x64
  400f24:       d65f03c0        ret


Let's take a look at NOPRUNE functions and do the comparison:

1. NOPRUNE_FUNC1

0000000000400cac <noprune_func1.default>:
  400cac:       7100003f        cmp     w1, #0x0
  400cb0:       5400044d        b.le    400d38 <noprune_func1.default+0x8c>
  400cb4:       51000422        sub     w2, w1, #0x1
  400cb8:       7100085f        cmp     w2, #0x2
  400cbc:       540005a9        b.ls    400d70 <noprune_func1.default+0xc4>  // b.plast
  400cc0:       b0000004        adrp    x4, 401000 <scale_samples._Mrng+0x3c>
  400cc4:       53027c23        lsr     w3, w1, #2
  400cc8:       4f00043e        movi    v30.4s, #0x1
  400ccc:       aa0003e2        mov     x2, x0
  400cd0:       3dc2309f        ldr     q31, [x4, #2240]
  400cd4:       8b235003        add     x3, x0, w3, uxtw #4
  400cd8:       4f00049d        movi    v29.4s, #0x4
  400cdc:       d503201f        nop
  400ce0:       3dc00040        ldr     q0, [x2]
  400ce4:       4e3e1ffa        and     v26.16b, v31.16b, v30.16b
  400ce8:       4ebd87ff        add     v31.4s, v31.4s, v29.4s
  400cec:       4ea0841b        add     v27.4s, v0.4s, v0.4s
  400cf0:       4ea09b5a        cmeq    v26.4s, v26.4s, #0
  400cf4:       4ea0877b        add     v27.4s, v27.4s, v0.4s
  400cf8:       4ebe877c        add     v28.4s, v27.4s, v30.4s
  400cfc:       6eba1f7c        bit     v28.16b, v27.16b, v26.16b
  400d00:       3c81045c        str     q28, [x2], #16
  400d04:       eb03005f        cmp     x2, x3
  400d08:       54fffec1        b.ne    400ce0 <noprune_func1.default+0x34>  // b.any
  400d0c:       121e7422        and     w2, w1, #0xfffffffc
  400d10:       6b02003f        cmp     w1, w2
  400d14:       54000120        b.eq    400d38 <noprune_func1.default+0x8c>  // b.none
  400d18:       d37e7c43        ubfiz   x3, x2, #2, #32
  400d1c:       8b030006        add     x6, x0, x3
  400d20:       b8636804        ldr     w4, [x0, x3]
  400d24:       0b040484        add     w4, w4, w4, lsl #1
    ...

0000000000400ec8 <noprune_func1._Msve2>:
  400ec8:       25a10fe7        whilelo p7.s, wzr, w1
  400ecc:       d2800002        mov     x2, #0x0                        // #0
  400ed0:       04a0e3e3        cntw    x3
  400ed4:       04a1401f        index   z31.s, #0, #1
  400ed8:       25b8c03e        mov     z30.s, #1
  400edc:       2518e3e5        ptrue   p5.b
  400ee0:       7100003f        cmp     w1, #0x0
  400ee4:       5400018d        b.le    400f14 <noprune_func1._Msve2+0x4c>
  400ee8:       a5425c1d        ld1w    {z29.s}, p7/z, [x0, x2, lsl #2]
  400eec:       0420bffc        movprfx z28, z31
  400ef0:       0580001c        and     z28.s, z28.s, #0x1
  400ef4:       04bda7bd        adr     z29.s, [z29.s, z29.s, lsl #1]
  400ef8:       25809796        cmpne   p6.s, p5/z, z28.s, #0
  400efc:       04801bdd        add     z29.s, p6/m, z29.s, z30.s
  400f00:       e5425c1d        st1w    {z29.s}, p7, [x0, x2, lsl #2]
  400f04:       04b0c3ff        incw    z31.s
  400f08:       8b030042        add     x2, x2, x3
  400f0c:       25a10c47        whilelo p7.s, w2, w1
  400f10:       54fffec1        b.ne    400ee8 <noprune_func1._Msve2+0x20>  // b.any
  400f14:       d65f03c0        ret
  400f18:       d503201f        nop
  400f1c:       d503201f        nop
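The difference here is easy to see once you look at the registers: the default version works on v and q registers (Advanced SIMD), while the _Msve2 version works on z and p registers (SVE). As a quick heuristic, a script can flag which register families a listing touches. The sketch below is a hypothetical Python helper that does this with two regular expressions. It only looks at operand text and is not a real disassembler, so treat it as a rough check.

```python
import re

SVE_REG = re.compile(r"\b[zp]\d+\b")      # z0-z31 vectors, p0-p15 predicates
ADVSIMD_REG = re.compile(r"\b[vq]\d+\b")  # v0-v31 and their q views


def register_families(listing):
    """Return the set of vector register families used in a listing."""
    families = set()
    for line in listing.splitlines():
        # Split into address, encoding, mnemonic, operands.
        parts = line.split(None, 3)
        if len(parts) < 4 or not parts[0].endswith(":"):
            continue
        operands = parts[3]
        if SVE_REG.search(operands):
            families.add("sve")
        if ADVSIMD_REG.search(operands):
            families.add("advsimd")
    return families


# A few lines copied from each listing above.
DEFAULT_LINES = """\
  400cc8:       4f00043e        movi    v30.4s, #0x1
  400cd0:       3dc2309f        ldr     q31, [x4, #2240]
  400d08:       54fffec1        b.ne    400ce0 <noprune_func1.default+0x34>  // b.any
"""
SVE2_LINES = """\
  400ec8:       25a10fe7        whilelo p7.s, wzr, w1
  400ed4:       04a1401f        index   z31.s, #0, #1
  400eec:       0420bffc        movprfx z28, z31
"""

print(register_families(DEFAULT_LINES))  # {'advsimd'}
print(register_families(SVE2_LINES))     # {'sve'}
```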


2. NOPRUNE_FUNC5


0000000000400ca0 <noprune_func5.default>:
  400ca0:       17ffffd8        b       400c00 <noprune_func4.default>

0000000000400f28 <noprune_func5._Mrng>:
  400f28:       7100003f        cmp     w1, #0x0
  400f2c:       5400042d        b.le    400fb0 <noprune_func5._Mrng+0x88>
  400f30:       51000422        sub     w2, w1, #0x1
  400f34:       7100085f        cmp     w2, #0x2
  400f38:       540003e9        b.ls    400fb4 <noprune_func5._Mrng+0x8c>  // b.plast
  400f3c:       53027c23        lsr     w3, w1, #2
  400f40:       aa0003e2        mov     x2, x0
  400f44:       8b235003        add     x3, x0, w3, uxtw #4
  400f48:       3dc0005f        ldr     q31, [x2]
  400f4c:       4ebf87ff        add     v31.4s, v31.4s, v31.4s
  400f50:       3c81045f        str     q31, [x2], #16
  400f54:       eb03005f        cmp     x2, x3
  400f58:       54ffff81        b.ne    400f48 <noprune_func5._Mrng+0x20>  // b.any
  400f5c:       121e7422        and     w2, w1, #0xfffffffc
  400f60:       6b02003f        cmp     w1, w2
  400f64:       54000260        b.eq    400fb0 <noprune_func5._Mrng+0x88>  // b.none
  400f68:       d37e7c43        ubfiz   x3, x2, #2, #32
  400f6c:       11000444        add     w4, w2, #0x1
  400f70:       b8636805        ldr     w5, [x0, x3]
  400f74:       531f78a5        lsl     w5, w5, #1
  400f78:       b8236805        str     w5, [x0, x3]
  400f7c:       6b04003f        cmp     w1, w4
  400f80:       5400018d        b.le    400fb0 <noprune_func5._Mrng+0x88>
  400f84:       91001065        add     x5, x3, #0x4
    ...


3. NOPRUNE_FUNC4


0000000000400c00 <noprune_func4.default>:
  400c00:       7100003f        cmp     w1, #0x0
  400c04:       5400042d        b.le    400c88 <noprune_func4.default+0x88>
  400c08:       51000422        sub     w2, w1, #0x1
  400c0c:       7100085f        cmp     w2, #0x2
  400c10:       540003e9        b.ls    400c8c <noprune_func4.default+0x8c>  // b.plast
  400c14:       53027c23        lsr     w3, w1, #2
  400c18:       aa0003e2        mov     x2, x0
  400c1c:       8b235003        add     x3, x0, w3, uxtw #4
  400c20:       3dc0005f        ldr     q31, [x2]
  400c24:       4ebf87ff        add     v31.4s, v31.4s, v31.4s
  400c28:       3c81045f        str     q31, [x2], #16
  400c2c:       eb03005f        cmp     x2, x3
  400c30:       54ffff81        b.ne    400c20 <noprune_func4.default+0x20>  // b.any
  400c34:       121e7422        and     w2, w1, #0xfffffffc
  400c38:       6b02003f        cmp     w1, w2
  400c3c:       54000260        b.eq    400c88 <noprune_func4.default+0x88>  // b.none
  400c40:       d37e7c43        ubfiz   x3, x2, #2, #32
  400c44:       11000444        add     w4, w2, #0x1
  400c48:       b8636805        ldr     w5, [x0, x3]
  400c4c:       531f78a5        lsl     w5, w5, #1
  400c50:       b8236805        str     w5, [x0, x3]
  400c54:       6b04003f        cmp     w1, w4
  400c58:       5400018d        b.le    400c88 <noprune_func4.default+0x88>
  400c5c:       91001065        add     x5, x3, #0x4
  400c60:       11000842        add     w2, w2, #0x2
  400c64:       b8656804        ldr     w4, [x0, x5]
  400c68:       531f7884        lsl     w4, w4, #1
  400c6c:       b8256804        str     w4, [x0, x5]

0000000000400fc0 <noprune_func4._Mrng>:
  400fc0:       17ffffda        b       400f28 <noprune_func5._Mrng>
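One thing worth noting in the last two comparisons: noprune_func5.default and noprune_func4._Mrng each consist of a single b instruction that jumps into another function's body. As a small check, the Python sketch below decodes the target of an unconditional B (a signed 26-bit field in the low bits, counted in 4-byte instructions) and confirms the targets objdump printed. The helper name b_target is hypothetical.

```python
def b_target(address, word):
    """Target address of an AArch64 unconditional B at the given address."""
    if word >> 26 != 0b000101:
        raise ValueError("not an unconditional B instruction word")
    imm26 = word & 0x3FFFFFF
    if imm26 & 0x2000000:  # sign-extend the 26-bit field
        imm26 -= 0x4000000
    return address + imm26 * 4


# noprune_func5.default: "400ca0: 17ffffd8  b 400c00 <noprune_func4.default>"
print(hex(b_target(0x400CA0, 0x17FFFFD8)))  # 0x400c00

# noprune_func4._Mrng: "400fc0: 17ffffda  b 400f28 <noprune_func5._Mrng>"
print(hex(b_target(0x400FC0, 0x17FFFFDA)))  # 0x400f28
```

Because these one-instruction bodies simply hand off to another function, a lone b is not by itself evidence of a different implementation. When comparing such a function against its clone, it may be worth following the branch and comparing against the body it lands in.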




End of Project Stage 3, Part 2.


Thank you so much for taking the time to read this continuation of Part 1. Stay tuned for the challenges I faced and my reflections on the whole process. If you want to read about them, please go to Project Stage 3, Part 3 - Challenges and Reflection.