<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>Julia Community 🟣: José Pereira</title>
    <description>The latest articles on Julia Community 🟣 by José Pereira (@josepereiraua).</description>
    <link>https://forem.julialang.org/josepereiraua</link>
    <image>
      <url>https://forem.julialang.org/images/fNgJa9dMPTUTAxd_KF7MI4sUWtI6tFzWEFR4vXO8ZE0/rs:fill:90:90/g:sm/mb:500000/ar:1/aHR0cHM6Ly9mb3Jl/bS5qdWxpYWxhbmcu/b3JnL3JlbW90ZWlt/YWdlcy91cGxvYWRz/L3VzZXIvcHJvZmls/ZV9pbWFnZS80MjYv/M2NmMjc2YzEtY2Rh/MC00ZWYxLThkZjct/ZTQyNmRhZmJhZTI1/LmpwZw</url>
      <title>Julia Community 🟣: José Pereira</title>
      <link>https://forem.julialang.org/josepereiraua</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://forem.julialang.org/feed/josepereiraua"/>
    <language>en</language>
    <item>
      <title>Julia for protein design?</title>
      <dc:creator>José Pereira</dc:creator>
      <pubDate>Tue, 07 Jun 2022 17:55:30 +0000</pubDate>
      <link>https://forem.julialang.org/josepereiraua/julia-for-protein-design-43na</link>
      <guid>https://forem.julialang.org/josepereiraua/julia-for-protein-design-43na</guid>
      <description>&lt;p&gt;By now, it's undeniable that Julia has been gaining traction in the field of scientific computing and &lt;a href="https://forem.julialang.org/emiller/why-im-hyped-about-julia-for-bioinformatics-2ihj"&gt;bioinformatics&lt;/a&gt;, with amazing packages such as &lt;a href="https://biojulia.net/"&gt;BioJulia&lt;/a&gt; and super active forums of discussion, such as the &lt;strong&gt;#biology&lt;/strong&gt; channel at Julia's Slack.&lt;/p&gt;

&lt;p&gt;In this quick overview I would like to dive a little deeper into a more niche topic of scientific computing: protein design.&lt;/p&gt;

&lt;h2&gt;What is protein design?&lt;/h2&gt;

&lt;p&gt;Proteins are the workhorses of nature, with a multitude of functions ranging from structural support and transport to enzymatic, hormonal and even immune responses. All this versatility is a product of a "simple" (yet beautiful) combination of just 20 different amino acids - the building blocks of life. Once synthesized, sequences of amino acids &lt;strong&gt;fold&lt;/strong&gt; into a given conformation, and it is this structural organization that confers a specific task in the context of the cell.&lt;/p&gt;

&lt;p&gt;In short, protein design is the scientific area of research that attempts to generate sequences of amino acids “à la carte” that will fold into unnatural conformations with novel activities or behaviors.&lt;/p&gt;

&lt;p&gt;This has traditionally been performed "blindly": new sequences were generated at random (sometimes via radiation-induced mutations) just to see what happens! As you may guess, this was horrendously expensive and time-consuming.&lt;/p&gt;

&lt;p&gt;In the last few decades, however, a new player has entered the game: computer-aided design (a.k.a. CAD).&lt;/p&gt;

&lt;p&gt;In this new paradigm, protein sequences are simulated "in silico" beforehand, and prototypes are filtered for prospective candidates at a much higher throughput than even the wildest dreams of a couple of decades ago.&lt;/p&gt;

&lt;h2&gt;Software for protein design: are we done?&lt;/h2&gt;

&lt;p&gt;The history of computational protein design (despite being a somewhat young and fresh field) is rich and filled with breakthroughs. I suggest reading &lt;a href="https://onlinelibrary.wiley.com/doi/10.1002/pro.4098"&gt;this review&lt;/a&gt; on the topic.&lt;/p&gt;

&lt;p&gt;A common development architecture has, however, emerged: in order to simulate sequence designs and evaluate how "good" or "bad" they are, two fundamental pieces of software are required:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;A sampling engine: a way to introduce changes to a protein (that is, to manipulate the particles in the system), for example by introducing a mutation.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;An energy function: a way to evaluate the current system on how "real" or how "adjusted" it is (sometimes also called a "fitness function").&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
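
&lt;p&gt;To make this two-part architecture concrete, here is a minimal sketch in Python. It is not Rosetta or ProtoSyn.jl code, and every name in it is hypothetical: &lt;code&gt;mutate&lt;/code&gt; plays the role of the sampling step, &lt;code&gt;toy_energy&lt;/code&gt; is a made-up energy function that only checks whether hydrophobic residues sit at the sites marked "H" in a pattern, and &lt;code&gt;design&lt;/code&gt; ties the two together with a simple Metropolis Monte Carlo loop. Real energy functions are vastly more elaborate; this only shows how the two pieces interact.&lt;/p&gt;

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 canonical residues
HYDROPHOBIC = set("AVILMFWC")         # toy classification (an assumption)


def toy_energy(sequence, pattern):
    """Count positions whose hydrophobicity disagrees with the pattern.

    `pattern` uses "H" for a buried (hydrophobic) site and "P" for an
    exposed (polar) site. Lower is better; 0 means a perfect match.
    """
    return sum(
        (residue in HYDROPHOBIC) != (site == "H")
        for residue, site in zip(sequence, pattern)
    )


def mutate(sequence, rng):
    """Sampling step: replace one randomly chosen residue."""
    position = rng.randrange(len(sequence))
    residue = rng.choice(AMINO_ACIDS)
    return sequence[:position] + residue + sequence[position + 1:]


def design(pattern, steps=2000, temperature=0.5, seed=0):
    """Metropolis Monte Carlo over sequences, returning the best one seen."""
    rng = random.Random(seed)
    current = "".join(rng.choice(AMINO_ACIDS) for _ in pattern)
    best = current
    for _ in range(steps):
        candidate = mutate(current, rng)
        delta = toy_energy(candidate, pattern) - toy_energy(current, pattern)
        # Metropolis rule: downhill (or equal) moves are always accepted,
        # uphill moves with probability exp(-delta / temperature).
        accept = math.exp(min(0.0, -delta / temperature))
        current = rng.choices(
            [candidate, current], weights=[accept, 1.0 - accept]
        )[0]
        # Keep the lowest-energy sequence visited so far.
        best = min(best, current, key=lambda s: toy_energy(s, pattern))
    return best
```

&lt;p&gt;Calling &lt;code&gt;design("HPPHHPPH")&lt;/code&gt; returns an eight-residue sequence whose toy energy is no higher than that of the random starting sequence. Swapping in a different &lt;code&gt;mutate&lt;/code&gt; or &lt;code&gt;toy_energy&lt;/code&gt; changes what is being designed without touching the loop, which is the whole point of keeping the two pieces separate.&lt;/p&gt;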

&lt;p&gt;Both of these requirements have been thoroughly explored in the past. Arguably the most successful effort to date is &lt;a href="https://www.rosettacommons.org/software"&gt;Rosetta&lt;/a&gt; (and its Python wrapper, &lt;a href="https://www.pyrosetta.org/"&gt;PyRosetta&lt;/a&gt;), developed under the supervision of Dr. David Baker.&lt;/p&gt;

&lt;p&gt;Honestly, I think I could probably write a 100-page essay on how much Rosetta changed the landscape of computational protein design! However, are we done? Is this the best we can do?&lt;/p&gt;

&lt;p&gt;Well ...&lt;/p&gt;

&lt;h2&gt;The good, the bad and the ugly&lt;/h2&gt;

&lt;p&gt;I don't think we're done. &lt;strong&gt;Rosetta is awesome.&lt;/strong&gt; Rosetta revolutionized the way we do science and the way we engineer proteins for our own purposes. &lt;strong&gt;But Rosetta is not perfect.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Having had to learn the inner workings of PyRosetta during my PhD, I can safely argue that Rosetta is not an example of a modern API for protein design. Here's why:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Rosetta (C++) and PyRosetta (Python) are a perfect example of the two-language problem;&lt;/li&gt;
&lt;li&gt;The Rosetta software is, for the most part, a patchwork of individual applications for specific uses, interlaced with complicated mechanisms such as RosettaScripts (an XML-based syntax for setting up algorithms that combine several of Rosetta's functionalities);&lt;/li&gt;
&lt;li&gt;PyRosetta's documentation is infamous for its lack of information and outdated examples;&lt;/li&gt;
&lt;li&gt;Rosetta does not directly benefit from modern hardware, such as GPUs or distributed computing;&lt;/li&gt;
&lt;li&gt;Rosetta is not an open-source project (and, given the lack of documentation, virtually impossible to modify).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In short: &lt;strong&gt;we can do better.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;What's next?&lt;/h2&gt;

&lt;p&gt;I see in Julia the path forward for protein design software. I'll take the chance to add a shameless plug for my own PhD work, where I try to tackle this same problem and develop a modern approach to protein design: &lt;a href="https://github.com/sergio-santos-group/ProtoSyn.jl"&gt;ProtoSyn.jl&lt;/a&gt;. Although it is still a work in progress, I hope future users (and contributors) find in ProtoSyn.jl a home for all things related to molecular manipulation &amp;amp; simulation, with (of course) a strong emphasis on protein design. I'll keep this short: without going into too much detail, here's an incomplete list of features that I feel should shape a modern Julia-based approach to protein design:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Complete molecular manipulation tools&lt;/li&gt;
&lt;li&gt;Fast &amp;amp; native energy functions, with optional plug-and-play support for established energy functions from external packages&lt;/li&gt;
&lt;li&gt;GPU &amp;amp; distributed computing support&lt;/li&gt;
&lt;li&gt;Incorporation of non-canonical amino acids&lt;/li&gt;
&lt;li&gt;Addition of post-translational modifications&lt;/li&gt;
&lt;li&gt;Support for ramified peptides and glycoproteins&lt;/li&gt;
&lt;li&gt;Support for common optimization simulations (steepest descent, Monte Carlo, etc.)&lt;/li&gt;
&lt;li&gt;Full suite of up-to-date examples, tutorials and documentation&lt;/li&gt;
&lt;li&gt;Free pizza&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Well, ProtoSyn.jl does almost all of the above (we still haven't figured out how to get free pizza)!&lt;/p&gt;

&lt;p&gt;I'll finish this post here, with huge enthusiasm for what's to come. Hopefully, Julia shapes the future of protein design. &lt;strong&gt;The potential is all there!&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>protein</category>
      <category>design</category>
      <category>chemistry</category>
      <category>simulation</category>
    </item>
  </channel>
</rss>
