BATCH 4 | Project Stage 3, Part 3 - Challenges and Reflection

April 17, 2025

Project Stage 3, Part 3

Challenges and Reflection

Hi, if you have been following along, this is a continuation of my Project Stage 3 blog series. In this post, I’ll walk you through the challenges I encountered and how I resolved them. Below, you’ll find links to related blog posts:

1. Project Stage 3, Part 1 - Tidy & Wrap

2. Project Stage 3, Part 2 - Test Cases & Results

Challenges and Resolutions:

Challenge # 1: Code Only Works On One Cloned Function

My original code assumed that there was only one cloned function, scale_samples. Although that worked as intended, Stage 3 of the project requires the analysis to handle multiple cloned functions. This took some time to figure out, and I eventually changed the approach: the pass now stores all signatures in a vector per base name. It identifies each non-resolver clone and pushes its signature into the vector for its base name. Once a vector holds at least two entries, each entry is compared against the first to decide whether the function should be pruned.
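To illustrate the idea, here is a minimal sketch in standalone C++ rather than the actual GCC pass code. The CloneInfo struct, the signature strings, and the base_name and decide_prune helpers are hypothetical names made up for this example. It also assumes that identical signatures mean PRUNE and that the base name is everything before the first dot.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for whatever the pass records about each clone.
struct CloneInfo {
    std::string full_name;  // e.g. "scale_samples.default"
    std::string signature;  // e.g. a summary of the function's statements
};

// Assumption: the base name is everything before the first '.'.
// A name without a '.' is returned unchanged.
std::string base_name(const std::string &full_name) {
    return full_name.substr(0, full_name.find('.'));
}

// Groups signatures per base name, then compares every entry against the
// first one. Returns base name -> true (PRUNE) or false (NOPRUNE).
// Groups with fewer than two clones are left out: nothing to compare.
std::map<std::string, bool>
decide_prune(const std::vector<CloneInfo> &clones) {
    std::map<std::string, std::vector<std::string>> by_base;
    for (const CloneInfo &c : clones)
        by_base[base_name(c.full_name)].push_back(c.signature);

    std::map<std::string, bool> result;
    for (const auto &[base, sigs] : by_base) {
        if (sigs.size() < 2)
            continue;
        bool all_same = true;
        for (std::size_t i = 1; i < sigs.size(); ++i) {
            if (sigs[i] != sigs[0]) {
                all_same = false;
                break;
            }
        }
        result[base] = all_same;
    }
    return result;
}
```

Keeping one vector per base name means a second or third cloned function in the same file simply gets its own entry in the map, instead of overwriting state that was written with only scale_samples in mind.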

Challenge # 2: Ensuring Resolver Functions Do Not Interfere with Comparisons

Since GCC creates a “.resolver” function in addition to the clones, my logic has to filter out resolver functions while still handling multiple cloned functions. Otherwise, once a resolver function takes part in the comparison, the pass only outputs NOPRUNE messages. To resolve this, I made sure resolver functions are detected and skipped, so that only the clones go into the map and get compared.
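The filtering step can be sketched as a simple suffix check. This is a standalone illustration, not the code from my pass: is_resolver and non_resolver_clones are hypothetical helper names, and the sketch assumes the resolver can be recognized purely by a name ending in ".resolver".

```cpp
#include <string>
#include <string_view>
#include <vector>

// True when the name ends with ".resolver".
// Assumption: this suffix is enough to identify a resolver function.
bool is_resolver(std::string_view name) {
    constexpr std::string_view suffix = ".resolver";
    return name.size() >= suffix.size() &&
           name.substr(name.size() - suffix.size()) == suffix;
}

// Returns the input names with every resolver removed, preserving order,
// so that only real clones reach the comparison step.
std::vector<std::string>
non_resolver_clones(const std::vector<std::string> &names) {
    std::vector<std::string> kept;
    for (const std::string &n : names) {
        if (!is_resolver(n))
            kept.push_back(n);
    }
    return kept;
}
```

Running the check before anything is stored keeps the resolver out of the per-base-name vectors entirely, so the comparison code never needs a special case for it.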

Challenge # 3: Ensuring It Passes On Both x86_64 and aarch64 Architectures

Working with both x86_64 and aarch64 architectures proved to be a real challenge. Each architecture has its own set of built-in function expansions and optimization behaviors, which significantly affect how functions are cloned or pruned during compilation. For instance, aarch64 may inline or expand certain functions differently than x86_64 does. The goal was for both architectures to produce the same results, such as the diagnostic messages for every detected cloned function.

To address this challenge, I modified my count_pass to focus only on user-defined functions and avoided relying on architecture-specific naming patterns or assumptions. Additionally, I added diagnostic output to .count files to make it easier to compare function behaviour between x86_64 and aarch64.
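As a rough sketch of that idea, the example below filters out built-in functions and produces report lines in a fixed order so that two outputs can be compared line by line. It is standalone C++, not the real count_pass: the FunctionRecord struct, its is_builtin flag, the statement_count field, and the line format are all hypothetical. In a real pass the built-in flag would come from the compiler's own information about the declaration, not from the function's name.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical record of what a counting pass might know per function.
struct FunctionRecord {
    std::string name;
    bool is_builtin;      // assumed to be supplied by the compiler
    int statement_count;  // assumed metric written to the report
};

// Builds one report line per user-defined function, sorted by name so the
// output does not depend on the order in which a target visits functions.
std::vector<std::string> report_lines(std::vector<FunctionRecord> funcs) {
    // Drop built-ins: these are the entries most likely to differ by target.
    funcs.erase(std::remove_if(funcs.begin(), funcs.end(),
                               [](const FunctionRecord &f) {
                                   return f.is_builtin;
                               }),
                funcs.end());

    std::sort(funcs.begin(), funcs.end(),
              [](const FunctionRecord &a, const FunctionRecord &b) {
                  return a.name < b.name;
              });

    std::vector<std::string> lines;
    for (const FunctionRecord &f : funcs)
        lines.push_back(f.name + ": " +
                        std::to_string(f.statement_count) + " statements");
    return lines;
}
```

With a stable order and no target-specific entries, any remaining difference between the x86_64 and aarch64 reports points at a real difference in how the user's functions were compiled.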

Please refer to this blog post to see the results of my clone-prune analysis pass when running on the x86_64 architecture: Clone-Prune Analysis Pass On Both Architectures

Reflection:

Throughout the process of making my clone-prune analysis pass work on both architectures, I realized the importance of patience and determination. A task this complex takes time and persistence before it meets the project requirements. One of the hardest parts was keeping the logic of my clone-prune analysis consistent across both architectures, which required a deep dive into target-specific behaviours. Even though I eventually got the pass working as intended on both, it took a lot of effort to understand how the control flow and GIMPLE transformations differed between them. As the final project for this course, this was truly an experience I will take with me into my future career, as it taught me how to keep going despite the challenges I faced!
