<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en"><generator uri="https://jekyllrb.com/" version="4.4.1">Jekyll</generator><link href="https://oke-aditya.github.io/feed.xml" rel="self" type="application/atom+xml"/><link href="https://oke-aditya.github.io/" rel="alternate" type="text/html" hreflang="en"/><updated>2026-03-03T07:11:20+00:00</updated><id>https://oke-aditya.github.io/feed.xml</id><title type="html">blank</title><subtitle>MS student at CMU researching LLM safety, robustness, and interpretability for agentic systems. </subtitle><entry xml:lang="en"><title type="html">Challenges with Machine Learning Projects</title><link href="https://oke-aditya.github.io/challenges-ml-projects/" rel="alternate" type="text/html" title="Challenges with Machine Learning Projects"/><published>2020-12-25T00:00:00+00:00</published><updated>2020-12-25T00:00:00+00:00</updated><id>https://oke-aditya.github.io/challenges-ml-projects</id><content type="html" xml:base="https://oke-aditya.github.io/challenges-ml-projects/"><![CDATA[<h2 id="introduction">Introduction</h2> <p>Machine learning projects are hard to build. Maintaining them is even harder. The code grows complex, and after a month or a year it becomes nearly impossible to understand.</p> <h2 id="why-do-projects-become-spaghetti-code">Why do projects become spaghetti code?</h2> <h3 id="lack-of-project-structure">Lack of project structure</h3> <p>Projects that start without structure turn into a mess later. Usually, this happens when all code is dumped into a single <code class="language-plaintext highlighter-rouge">src</code> folder. The folder becomes cluttered, and the codebase becomes hard to understand. It is wise to divide the codebase into folders and subfolders, which makes the code easier to understand.</p> <h3 id="not-documenting-the-code-or-work">Not documenting the code or work</h3> <p>Code that looks obvious right now becomes very hard to understand later. 
Remember that the code you write should be reusable and accessible to others. Simple habits such as type hints and docstrings are often enough to make code clear.</p> <p>For example, consider a function that takes a tuple denoting a rectangle in <code class="language-plaintext highlighter-rouge">(x, y, w, h)</code> format:</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">foo</span><span class="p">(</span><span class="n">l</span><span class="p">):</span>
  <span class="n">a</span> <span class="o">=</span> <span class="n">l</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span> <span class="o">*</span> <span class="n">l</span><span class="p">[</span><span class="mi">3</span><span class="p">]</span>
  <span class="k">return</span> <span class="n">a</span>
</code></pre></div></div> <p>The code above works, but without documentation it is hard to tell what it does. Let’s improve it:</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="n">typing</span> <span class="kn">import</span> <span class="n">Tuple</span>


<span class="k">def</span> <span class="nf">area_rect</span><span class="p">(</span><span class="n">rect</span><span class="p">:</span> <span class="n">Tuple</span><span class="p">):</span>
  <span class="sh">"""</span><span class="s">
  Arguments:
    rect (Tuple): A tuple denoting a rectangle in (x, y, w, h)
    format, where x, y are coordinates and w, h are width
    and height.
  Returns (int):
    Area of the rectangle.
  </span><span class="sh">"""</span>
  <span class="n">area</span> <span class="o">=</span> <span class="n">rect</span><span class="p">[</span><span class="mi">2</span><span class="p">]</span> <span class="o">*</span> <span class="n">rect</span><span class="p">[</span><span class="mi">3</span><span class="p">]</span>
  <span class="k">return</span> <span class="n">area</span>
</code></pre></div></div> <p>How simple, yet how powerful, documentation can be!</p> <h3 id="missing-requirements-not-specifying-how-to-use-the-project">Missing requirements, not specifying how to use the project</h3> <p>Small additions, such as a <code class="language-plaintext highlighter-rouge">requirements.txt</code> file, help people reproduce the package environment you used. This keeps their results consistent with yours.</p> <p>A proper <code class="language-plaintext highlighter-rouge">README</code> that describes how the project is structured and how to run it is really helpful. It lets people reproduce your results and try the code in their own work.</p> <h3 id="not-using-functions">Not using functions</h3> <p>Many people work without writing functions. This makes code less modular and harder to comprehend. Writing functions with docstrings and using them keeps the code clear and easier to understand. A simple pattern is to define functions at the top of the file and call them in <code class="language-plaintext highlighter-rouge">main</code>. For example:</p> <div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">if</span> <span class="n">__name__</span> <span class="o">==</span> <span class="sh">"</span><span class="s">__main__</span><span class="sh">"</span><span class="p">:</span>
  <span class="n">rect</span> <span class="o">=</span> <span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="mi">10</span><span class="p">,</span> <span class="mi">20</span><span class="p">)</span>
  <span class="n">res</span> <span class="o">=</span> <span class="nf">area_rect</span><span class="p">(</span><span class="n">rect</span><span class="p">)</span>
</code></pre></div></div> <h3 id="overengineering">Overengineering</h3> <p>While the examples above are cases of underengineering or of not following simple practices, overengineering is another issue. Sometimes, people try to do too much and reinvent the wheel. Many well-supported and well-documented libraries are available. Prefer those libraries and their methods; most of the functionality you need already exists and works well.</p> <p>That’s all for this blog! Hopefully, your next project doesn’t become spaghetti code!</p>]]></content><author><name></name></author><category term="machine-learning"/><category term="Machine Learning"/><summary type="html"><![CDATA[Why machine learning projects often devolve into spaghetti code and how to avoid it.]]></summary></entry></feed>