How to use Anglerfish
Query setup
You need to provide a query molecule. Possible targets and bioactivities will be predicted for this query molecule based on its similarity with known active molecules.
The query molecule can be provided in one of two ways:
- By uploading an SDF file containing only the query molecule; or
- By pasting or writing the query molecule in SMILES or InChI format in the text box.
Next, choose the types of molecular fingerprint to use for the similarity searches. The more fingerprint types you choose, the longer the search will take to complete.
Once you have chosen the fingerprint types you would like to use, click the "Submit" button.
(Don't forget to fill in the CAPTCHA)
If anything went wrong, you will see a notification indicating what needs to be changed.
Otherwise, you will be presented with a 2D preview of your query molecule, its SMILES and the fingerprint types you chose.
If there's something wrong (e.g., you forgot one fingerprint type or you mistyped the SMILES in the previous step and the molecule is wrong) you can go back and fix it.
If everything is alright, you can proceed to submit the search job.
Waiting for the results
You will now be presented with a page showing the status of your query.
The possible statuses are:
- QUEUED: The query is scheduled to run once the server load allows.
- RUNNING: The query is being executed. This can last several minutes, depending on server load, number of fingerprint types chosen, query molecule size, etc.
- ERROR, other: Something went wrong with your query. Please contact us at adria.cereto _at_ urv.cat to see if we can fix the issue.
The URL to this page can be saved and accessed later to check the query status.
Once the query has finished, the page will redirect to the results page.
Results page
The results page offers a brief summary of the query and a table with the results grouped by target.
The columns of the table are:
- Target Name
- Expected pX: the predicted activity value of the query molecule on this target, based on similarity to its actives.
- Max pX, Min pX: the maximum and minimum pX values, for this target, of the molecules similar to the query molecule, given for reference.
- Max/Min Average Tanimoto: the maximal and minimal average similarity (across the different chosen fingerprint types) to an active for this target.
- Hits: the number of known active molecules for this target which are similar to the query molecule. Clicking this value opens the Detailed Results Page for this target.
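To make the Tanimoto columns more concrete, here is a minimal sketch of how a Tanimoto similarity and its average across fingerprint types can be computed. It is not Anglerfish's own code: fingerprints are represented here as sets of "on" bit positions, the function names are hypothetical, and the bit sets are toy values chosen only for illustration.

```python
def tanimoto(bits_a: set, bits_b: set) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of 'on' bit positions."""
    union = len(bits_a | bits_b)
    if union == 0:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(bits_a & bits_b) / union


def average_tanimoto(query: dict, active: dict) -> float:
    """Mean Tanimoto over the fingerprint types present in both dictionaries."""
    shared = sorted(query.keys() & active.keys())
    if not shared:
        raise ValueError("no fingerprint type in common")
    return sum(tanimoto(query[t], active[t]) for t in shared) / len(shared)


# Toy bit sets (not real fingerprints), keyed by fingerprint type name.
query_fps = {"RDKit-Morgan": {1, 4, 7, 9}, "OpenBabel-FP2": {2, 3, 5}}
active_fps = {"RDKit-Morgan": {1, 4, 8, 9}, "OpenBabel-FP2": {2, 3, 5, 6}}

for fp_type in sorted(query_fps):
    print(fp_type, tanimoto(query_fps[fp_type], active_fps[fp_type]))
print("average", average_tanimoto(query_fps, active_fps))
```

In this toy case the two fingerprint types give 3/5 and 3/4 respectively, so the average lies between them. The "Max/Min Average Tanimoto" columns would then correspond to the largest and smallest such average over all similar actives of a target.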
The result table features a search box that supports advanced search arguments and filters.
The result table can be downloaded as a CSV file, which can be imported into Excel or other spreadsheet software.
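The downloaded CSV can also be processed with a script. The sketch below filters targets by predicted activity using only the Python standard library. The column headers ("Target Name", "Expected pX") are assumed to match the table columns listed above, and the threshold and sample rows are invented placeholders; check the header line of your own file before relying on them.

```python
import csv
import io


def top_targets(csv_text, px_column="Expected pX", name_column="Target Name", min_px=6.0):
    """Return (target, pX) pairs with pX >= min_px, sorted by descending pX.

    The column names are assumptions; pass the headers found in your own file.
    """
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            px = float(row[px_column])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows without a usable numeric pX
        if px >= min_px:
            rows.append((row.get(name_column, ""), px))
    return sorted(rows, key=lambda pair: pair[1], reverse=True)


# Placeholder content standing in for a downloaded file.
sample = "Target Name,Expected pX,Hits\nTarget A,5.2,3\nTarget B,7.1,12\nTarget C,6.4,5\n"
for name, px in top_targets(sample):
    print(name, px)
```

For a real download, read the file contents first (for example with `open(path, newline="", encoding="utf-8").read()`) and pass them to `top_targets`.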
There is also a link to repeat the search starting from the same settings (molecule and chosen fingerprints), which can then be adjusted.
Detailed Results page
This page is similar to the Results page, but it doesn't group results by target and provides detailed data on the similarities between the query molecule and each active.
The table includes:
- Target name
- Standard Type: the type of bioactivity upon which the pX is based (Ki, IC50, Kb, etc.)
- Active: the ChEMBL ID of the active molecule
- Tanimoto: Tanimoto similarity values between the query molecule and the active for each of the chosen fingerprint types, and their average.
The detailed results table features the same advanced search and export capabilities as the previous results table.
RESTful API
Anglerfish can also be used through a RESTful API.
It supports the POST method to submit a query, and the GET method to retrieve its status and results.
To submit a new search, send a POST request to http://anglerfish.urv.cat/anglerfish/api/search with a JSON body containing the following variables:
- "fptypes": containing a list of the fingerprints to use. If omitted, the default ones (["OpenBabel-MACCS", "OpenBabel-FP3", "RDKit-Fingerprint"]) will be used. The supported types are the following:
  - "RDKit-MACCS166"
  - "RDKit-Fingerprint"
  - "RDKit-Morgan"
  - "RDKit-Torsion"
  - "RDKit-AtomPair"
  - "OpenBabel-FP2"
  - "OpenBabel-FP3"
  - "OpenBabel-FP4"
  - "OpenBabel-MACCS"
- "smiles" OR "inchi": containing the query molecule in the format indicated by the string. If neither "inchi" nor "smiles" is provided, or both are provided, the submission will fail.
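The rules above can be checked on the client side before anything is sent. The following sketch builds the JSON body for a submission and rejects requests that the server would refuse. The function name is hypothetical, the key name "fptypes" follows the cURL example on this page, and the client-side check of fingerprint names is an added convenience rather than documented server behaviour.

```python
import json

SUPPORTED_FPTYPES = {
    "RDKit-MACCS166", "RDKit-Fingerprint", "RDKit-Morgan", "RDKit-Torsion",
    "RDKit-AtomPair", "OpenBabel-FP2", "OpenBabel-FP3", "OpenBabel-FP4",
    "OpenBabel-MACCS",
}


def build_search_payload(smiles=None, inchi=None, fptypes=None) -> str:
    """Return the JSON body for a search, enforcing exactly one molecule format."""
    if (smiles is None) == (inchi is None):
        raise ValueError('provide exactly one of "smiles" or "inchi"')
    payload = {"smiles": smiles} if smiles is not None else {"inchi": inchi}
    if fptypes is not None:
        unknown = sorted(set(fptypes) - SUPPORTED_FPTYPES)
        if unknown:
            raise ValueError("unsupported fingerprint types: " + ", ".join(unknown))
        payload["fptypes"] = list(fptypes)
    # When fptypes is omitted, the server-side defaults apply.
    return json.dumps(payload)


print(build_search_payload(smiles="CCC", fptypes=["RDKit-Morgan"]))
```

The returned string can be used as the body of the POST request, sent with the "Content-Type: application/json" header.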
The POST method, if successful, will return a JSON object with one variable, "id", which will be a long string that will look similar to "23a17a0c-4c277ebfcb9b68a340a0t087c-75f8053072f678043c0y3f072f57f-723477fy".
With this unique ID, you can access the results of the query with the GET method, through the URL http://anglerfish.urv.cat/anglerfish/api/get/YOUR_SUBMISSION_ID.
This ID can also be used to browse the query results through the web interface, by going to http://anglerfish.urv.cat/anglerfish/wait?id=YOUR_SUBMISSION_ID.
The GET method, if successful, will return a JSON file with the following fields, depending on the success of the query:
- "status": Status of the query (QUEUED, RUNNING, COMPLETED or FAILED)
- "fptypes": the fingerprint types chosen for the similarity search
- "traceback": in case of an error during the search, this field will contain details that would help debug it
- "query": An SDF representation of the query molecule
- "result_table": the search results (in JSON)
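Since a query can stay QUEUED or RUNNING for several minutes, a client will usually poll the GET endpoint until the status settles. The sketch below shows one way to do that with the Python standard library. The function names, the 30-second interval and the poll limit are assumptions, as is treating every status other than COMPLETED or FAILED as "still pending".

```python
import json
import time
import urllib.request

API_GET = "http://anglerfish.urv.cat/anglerfish/api/get/"


def fetch_job(job_id: str) -> dict:
    """Retrieve the current JSON document for a submission ID."""
    with urllib.request.urlopen(API_GET + job_id) as response:
        return json.load(response)


def wait_for_job(job_id, fetch=fetch_job, interval=30.0, max_polls=120, sleep=time.sleep):
    """Poll until the job is COMPLETED (returns the JSON) or FAILED (raises)."""
    for _ in range(max_polls):
        job = fetch(job_id)
        status = job.get("status")
        if status == "COMPLETED":
            return job
        if status == "FAILED":
            # "traceback" carries debugging details when the search failed.
            raise RuntimeError(job.get("traceback") or "query failed")
        sleep(interval)  # assumed pending (QUEUED, RUNNING): wait and retry
    raise TimeoutError("job %s did not finish after %d polls" % (job_id, max_polls))
```

Calling `wait_for_job("YOUR_SUBMISSION_ID")` would then block until the results are available. The `fetch` and `sleep` parameters exist only so the loop can be exercised without network access.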
Here is an example of API usage with cURL:
curl -d '{"smiles":"CCC", "fptypes":["RDKit-Morgan"]}' -H "Content-Type: application/json" -X POST http://anglerfish.urv.cat/anglerfish/api/search --output post_result.json
The "post_result.json" file will look something like this:
{"id": "23a17a0c-4c277ebfcb9b68a340a0t087c-75f8053072f678043c0y3f072f57f-723477fy"}
With this ID we can now retrieve the status and results of the search:
curl -X GET http://anglerfish.urv.cat/anglerfish/api/get/23a17a0c-4c277ebfcb9b68a340a0t087c-75f8053072f678043c0y3f072f57f-723477fy --output results.json
The results.json file will contain all the results and associated data, as described above. Keep in mind that it can be upwards of 50 MB in size, so it may not be advisable to try to open it with a text editor.
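Instead of opening the file in an editor, it can be inspected with a short script. The sketch below loads results.json and prints a compact summary of the documented fields. The function name is hypothetical, and because the internal layout of "result_table" is not described here, the script only reports its type and length.

```python
import json


def summarize_results(path):
    """Load a results file and return a small summary of its top-level fields."""
    with open(path, encoding="utf-8") as handle:
        data = json.load(handle)
    table = data.get("result_table")
    return {
        "status": data.get("status"),
        "fptypes": data.get("fptypes"),
        # The table layout is not documented, so only report its shape.
        "result_table_type": type(table).__name__,
        "result_table_size": len(table) if hasattr(table, "__len__") else None,
    }


if __name__ == "__main__":
    import sys
    if len(sys.argv) > 1:
        print(summarize_results(sys.argv[1]))
```

Saved as, say, summarize.py, it could be run as `python summarize.py results.json`.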