HARPA.AI

  Custom Deep Research Bot | v2.0

Performs in-depth online research using your custom Google queries to answer your questions. First, create up to 4 specific Google searches to gather information; then ask your actual question to get a focused AI response based on those search results.
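The per-page character, token, and word counts shown while the command scans pages come from a simple heuristic in its JS steps (roughly 4 characters per token); a minimal sketch of that estimate:

```javascript
// Rough usage estimate, mirroring the command's COUNT steps.
// ~4 characters per token and ~0.75 words per token are heuristics,
// not exact tokenizer counts.
function estimateUsage(text) {
  const chars = text.length;
  const estimatedTokens = Math.ceil(chars / 4);
  const estimatedWords = Math.ceil(estimatedTokens * 0.75);
  return { chars, estimatedTokens, estimatedWords };
}

console.log(estimateUsage("x".repeat(8000)));
// → { chars: 8000, estimatedTokens: 2000, estimatedWords: 1500 }
```

Since each depth level adds another search pass, total token usage grows roughly linearly with the depth you select.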

Created by Morteza H.
Updated on Apr 10, 22:30
Installed 49 times
RUNS JS CODE

How to Use


Content

- type: say
  message: >-
    **Tip:** This command performs an in-depth online search for a given query
    or topic and is best suited for research. It takes time and is token-hungry
    (~20k tokens on GPT-4o-mini per search depth).
  label: TIP ABOUT COMMAND
- param: depth
  message: >-
    **Set the search depth.**


    Search depth indicates how many search pages will be analyzed. Each depth
    level generates an extra sub-query, improving search quality.
  options:
    - label: 🔍 SEARCH
      value: 1
    - label: 2 SUB-QUERIES
      value: 2
    - label: 3 SUB-QUERIES
      value: 3
    - label: 4 SUB-QUERIES
      value: 4
  optionsInvalid: false
  type: ask
  default: ''
  label: SET DEPTH LEVEL
- message: >-
    | Operator | Description | Example |

    |----------|------------|----------|

    | site: | Restricts results to a specific domain | site:arxiv.org
    "transformer model" |

    | -site: | Excludes a specific domain from results | AI research
    -site:wikipedia.org |

    | "..." | Forces exact phrase match | "machine learning architecture" |

    | filetype: | Searches for specific file types | filetype:pdf "reinforcement
    learning" |

    | intitle: | Finds pages with specific term in title | intitle:AI benchmarks
    |

    | allintitle: | All terms must appear in title | allintitle:machine learning
    python |

    | intext: | Finds pages with specific term in body text | intext:neural
    network topology |

    | allintext: | All terms must appear in body text | allintext:GPT
    architecture training |

    | define: | Shows definition of a term | define:neural network |

    | OR \| | Searches for either term | GPT-4 OR "large language model" |

    | AND | Requires both terms to be present | transformer AND "attention
    mechanism" |

    | inurl: | Finds pages with specific term in URL |
    inurl:artificial-intelligence |

    | * | Wildcard for any word | "deep * model" |

    | -term | Excludes a specific term | AI -chatbot |

    | () | Groups search terms for complex queries | (GPT-4 OR LLaMA)
    architecture |

    | before: after: | Time-based filtering | AI breakthroughs after:2023 |
  label: TIP ABOUT SEARCH OPERATOR
  type: say
- param: userQuery1
  message: >-
    ![icon](/img/commands/general-search-refraction.svg) Please enter the first
    of your {{depth}} search queries that will be used to retrieve Google
    search results:


    **Tip:** You have the option to improve your search query for better
    results.

    Choose the option below or write your own query manually:
  options:
    - label: ✅ IMPROVE MY QUERY
      value: improve
    - $custom
  condition: '{{depth}} >= 1'
  label: ASK FIRST SEARCH QUERY
  type: ask
  default: ''
  optionsInvalid: false
- param: userQuery
  message: >-
    Great! Please describe what you're looking for in your own words.

    We'll improve your query to get you the best possible results. Keep it clear
    and detailed if you can.
  options: null
  condition: '{{userQuery1}} = improve'
  label: IMPROVE MY QUERY
  type: ask
  default: ''
  optionsInvalid: false
- type: gpt
  prompt: >-
    # Prompt: Broad Coverage Single Google Search Query Generator


    You are an expert Google Search Strategist. Your specialty is dissecting a
    user's information need and constructing the **single broadest, yet
    relevant,** Google search query using advanced syntax to maximize the number
    of potential results.


    OUTPUT LANGUAGE: Respond **only** in the language of the **LANGUAGE BASED ON
    USER INPUT**.


    TASK:

    Analyze the **USER INPUT** (topic, question, or keywords) to understand the
    core information need. Generate **one single** Google search query optimized
    for **maximum relevant coverage and result count**.


    Key Techniques & Context for Maximizing Results:

    Leverage Google's advanced search operators strategically, focusing on
    breadth:

    -   `OR`: Use extensively to include synonyms, related concepts, or
    alternative phrasings for core keywords. This is the primary tool for
    broadening the search.

    -   `( )`: Use to group `OR` statements or structure complex queries
    logically, ensuring correct operator precedence. Example: `(keywordA OR
    keywordB) AND (topicX OR topicY)`.

    -   **`" "` (Single Crucial Word ONLY):** Apply quotation marks **only** if
    there is **one single keyword** that is absolutely *crucial* and must appear
    exactly as typed (e.g., to distinguish it from homonyms or ensure its
    presence). **Do NOT use `" "` for phrases or multiple words.** If no single
    word is that critical, do not use `" "`.

    -   Core Keywords: Identify the essential concepts but consider broader
    terms or related ideas.

    -   `AND`: Note that Google implies `AND` between terms by default.
    Explicitly writing `AND` is usually unnecessary and does not broaden the
    search compared to just listing terms. Focus on `OR` for expansion.

    -   **Other Operators (`site:`, `-`, `*`, `intitle:`, `filetype:`):** Use
    these sparingly and only if they directly support the goal of broad,
    relevant coverage without unduly restricting results (e.g., `site:` could be
    used with `OR` across multiple relevant sites, `-` could exclude a major
    *irrelevant* topic).


    Inferential Capability & Query Selection:

    -   Keywords/Phrases: Extract core concepts. Brainstorm synonyms and related
    terms suitable for `OR` combinations.

    -   Crucial Word Identification: Determine if *one single word* meets the
    strict criteria for using `" "`. If not, omit `" "`.

    -   Structure: Construct the single query prioritizing broad keyword
    inclusion via `OR` and logical grouping with `( )`. Balance breadth with
    maintaining relevance to the user's core need.


    OUTPUT FORMAT:

    Present **only** the final, optimized Google search query string. Do not
    include any labels, explanations, introductory/concluding remarks, or any
    text other than the single query string itself.


    CONSTRAINTS:

    -   Generate exactly **one** optimized query string designed for maximum
    relevant results.

    -   Prioritize the use of `OR` and `( )` for breadth.

    -   Use `" "` **only** around a single, critical keyword if absolutely
    necessary, never for phrases.

    -   Output **only** the query string.

    -   Ensure the generated query is immediately copy-paste usable in Google.

    -   Adhere strictly to the specified output language (based on USER INPUT).


    BEGIN USER INPUT:

    {{userQuery}}
  label: BROAD COVERAGE SINGLE GOOGLE SEARCH QUERY GENERATOR
  condition: '{{userQuery1}} = improve'
  isolated: true
  param: userQuery1
  silent: true
- type: say
  message: |-
    🌐 Query created and will be used for the Google search:
    ```
    {{userQuery1}}
    ```
  label: DISPLAY IMPROVED QUERY
  condition: '{{userQuery}} != —'
- param: autoQueries.boolean
  message: >-
    Would you like an AI to automatically generate similar sub-queries based on
    the first sub-query you specified?
  options:
    - label: ✅ YES
      value: 'yes'
    - label: ⛔ NO
      value: 'no'
  vision:
    mode: area
    send: true
    enabled: false
    hint: ''
  label: AUTO QUERIES BOOLEAN
  condition: '{{depth}} > 1'
  type: ask
  default: ''
  optionsInvalid: false
- type: group
  steps:
    - param: userQuery2
      condition: '{{depth}} >= 2'
      message: ''
      type: ask
      options: null
      default: ''
    - param: userQuery3
      condition: '{{depth}} >= 3'
      message: ''
      type: ask
      options: null
      default: ''
    - param: userQuery4
      condition: '{{depth}} >= 4'
      message: ''
      type: ask
      options: null
      default: ''
  label: ASK SEARCH QUERIES
  condition: '{{autoQueries.boolean}} = no'
- param: question
  message: >-
    Please enter your actual question now. It will be used as the basis for the
    AI-generated response. The more precisely you phrase your question or
    request, the better the AI can address it. What would you like to know?
  vision:
    enabled: false
    mode: area
    hint: ''
    send: true
  label: THE REAL AND SPECIFIC QUERY FROM USER
  type: ask
  options: null
  default: ''
- steps:
    - prompt: >-
        # Prompt: Broad Coverage Multiple Google Search Queries Generator


        You are an expert Google Search Strategist. Your specialty is dissecting
        a user's information need and constructing **multiple broad, yet
        relevant** Google search queries using advanced syntax to maximize
        coverage of potential results.


        OUTPUT LANGUAGE: Respond **only** in the language of the **LANGUAGE
        BASED ON USER INPUT**.


        TASK:

        Analyze the **USER INPUT** (topic, question, or keywords) to understand
        the core information need. Generate **{{depth}} different** Google
        search queries optimized for **maximum relevant coverage and result
        count**.


        Key Techniques & Context for Maximizing Results:

        Leverage Google's advanced search operators strategically, focusing on
        breadth:

        -   `OR`: Use extensively to include synonyms, related concepts, or
        alternative phrasings for core keywords. This is the primary tool for
        broadening the search.

        -   `( )`: Use to group `OR` statements or structure complex queries
        logically, ensuring correct operator precedence. Example: `(keywordA OR
        keywordB) AND (topicX OR topicY)`.

        -   **`" "` (Single Crucial Word ONLY):** Apply quotation marks **only**
        if there is **one single keyword** that is absolutely *crucial* and must
        appear exactly as typed (e.g., to distinguish it from homonyms or ensure
        its presence). **Do NOT use `" "` for phrases or multiple words.** If no
        single word is that critical, do not use `" "`.

        -   Core Keywords: Identify the essential concepts but consider broader
        terms or related ideas.

        -   `AND`: Note that Google implies `AND` between terms by default.
        Explicitly writing `AND` is usually unnecessary and does not broaden the
        search compared to just listing terms. Focus on `OR` for expansion.

        -   **Other Operators (`site:`, `-`, `*`, `intitle:`, `filetype:`):** Use
        these sparingly and only if they directly support the goal of broad,
        relevant coverage without unduly restricting results (e.g., `site:`
        could be used with `OR` across multiple relevant sites, `-` could
        exclude a major *irrelevant* topic).


        Inferential Capability & Query Selection:

        -   Keywords/Phrases: Extract core concepts. Brainstorm synonyms and
        related terms suitable for `OR` combinations.

        -   Crucial Word Identification: Determine if *one single word* meets
        the strict criteria for using `" "`. If not, omit `" "`.

        -   Structure: Construct multiple queries prioritizing broad keyword
        inclusion via `OR` and logical grouping with `( )`. Balance breadth with
        maintaining relevance to the user's core need.


        OUTPUT FORMAT:

        Present your response as a JSON array with {{depth}} objects, each
        containing a "query" field with an optimized search query. Do not
        include any labels, explanations, introductory/concluding remarks, or
        any text other than the JSON array.


        Example JSON response:


        [
          {
            "query": "(eco-friendly OR sustainable OR green OR biodegradable) (packaging OR containers OR wrapping) (alternatives OR options OR solutions) (small business OR SME OR startup OR entrepreneur)"
          },
          {
            "query": "(cardboard OR paper OR bamboo OR hemp OR mushroom OR seaweed) packaging (small business OR retail OR SME) (sustainable OR eco-friendly)"
          },
          {
            "query": "(affordable OR budget OR cost-effective) (eco-friendly OR sustainable) packaging (implementation OR transition OR switch) (small business OR SME OR startup)"
          },
          {
            "query": "(suppliers OR vendors OR manufacturers OR brands) (eco-friendly OR green OR sustainable) packaging (small business OR SME OR local business)"
          }
        ]


        CONSTRAINTS:

        -   Generate exactly {{depth}} optimized query strings designed for
        maximum relevant results.

        -   Each query should focus on a different aspect or approach to the
        user's question.

        -   Prioritize the use of `OR` and `( )` for breadth.

        -   Use `" "` **only** around single, critical keywords if absolutely
        necessary, never for phrases.

        -   Output **only** the JSON array containing the queries.

        -   Ensure the generated queries are immediately copy-paste usable in
        Google.

        -   Adhere strictly to the specified output language (based on USER
        INPUT).


        BEGIN USER INPUT:

        {{userQuery1}}
      param: subqueries
      condition: '{{autoQueries.boolean}} = yes'
      type: gpt
      label: CREATE SUBQUERIES
      silent: true
      dumb: false
      isolated: true
    - type: js
      args: userQuery1, userQuery2, userQuery3, userQuery4
      code: |-
        function convertToJsonArray(args) {
          // Extract query parameters
          const queries = [
            args.userQuery1 || "",
            args.userQuery2 || "",
            args.userQuery3 || "",
            args.userQuery4 || ""
          ];
          
          // Filter out queries that are too short (less than 5 characters)
          const validQueries = queries
            .filter(query => query && query.trim().length >= 5)
            .map(query => ({ query: query.trim() }));
          
          // The depth is automatically determined by the number of valid queries
          // No need to slice the array as we only include valid queries
          
          // Return the JavaScript array object directly
          return validQueries;
        }

        // Execute with args from HARPA
        return convertToJsonArray(args);
      param: subqueries
      timeout: 15000
      onFailure: ''
      silent: true
      label: CONVERT USER QUERIES TO JSON
      condition:
        - '{{autoQueries.boolean}} = no'
        - '{{autoQueries.boolean}} = —'
    - type: calc
      func: extract-json
      to: subqueries
      param: subqueries
      index: ''
    - message: >-
        🧩 To give a comprehensive response, your search was split into
        **{{subqueries.length}}** sub-queries.
      type: say
    - steps:
        - value: '{{item.query}}'
          func: set
          param: query
          format: text
          type: calc
        - message: '⏳ Scanning information for sub-query: "**{{query}}**".'
          type: say
        - param: information
          value: '{{serp {{query}}}}'
          type: calc
          func: set
          format: ''
        - func: serp.extract-links
          to: links
          type: calc
          from: information
        - param: total
          value: '{{links.length}}'
          format: number
          type: calc
          func: set
        - steps:
            - value: '{{page {{item.url}}}}'
              param: content
              type: calc
              func: set
              format: ''
            - type: js
              code: |-
                const content = args['content']; 
                return { 
                 chars: content.length, 
                 estimatedTokens: Math.ceil(content.length / 4), 
                 estimatedWords: Math.ceil(content.length / 4 * 0.75) 
                };
              param: count
              timeout: 15000
              onFailure: SAY STATUS
              label: COUNT
              args: content
              silent: true
            - param: pageUrl
              value: '{{item.url}}'
              type: calc
              func: set
              format: text
            - code: |-
                // Extract the hostname (domain) part of the URL.
                const regex = /^(?:https?:\/\/)?(?:www\.)?([^\/]+)/;
                const matches = pageUrl.match(regex);

                if (matches) {
                  return matches[1];
                }

                // Fall back to the raw value if the pattern does not match.
                return pageUrl;
              param: hostname
              label: HOSTNAME
              type: js
              args: pageUrl
              timeout: 15000
              silent: true
            - func: increment
              param: index
              type: calc
              delta: 1
            - args: index, total
              code: |-
                function calculatePercentage(index, total) {
                    index = Number(index);
                    total = Number(total);

                    if (isNaN(index) || isNaN(total)) {
                        return "Error";
                    }

                    return (index / total * 100).toFixed(1) + "%";
                }

                let result = calculatePercentage(index, total);
                return result;
              param: percentage
              label: PERCENTAGE INDEX TOTAL
              type: js
              timeout: 15000
              silent: true
            - steps:
                - steps:
                    - prompt: >-
                        # Enhanced Web Content Extractor


                        You are a precise information extraction specialist.


                        OUTPUT LANGUAGE: LANGUAGE BASED ON WEB PAGE CONTENT

                        MAIN QUESTION:

                        {{question}}


                        TASK: Imagine you are a Google Search engine, collecting
                        relevant information from various websites to form an
                        answer to my QUESTION.


                        EXTRACTION INSTRUCTIONS:

                        - Focus on providing detailed factual information,
                        statistics, and specific examples when available

                        - Gather all information that will be useful in
                        answering my question, including context, numbers, dates
                        and concrete details

                        - Extract as much useful information as possible from
                        the WEB PAGE CONTENT

                        - Avoid general phrases

                        - When referring to someone's opinion about something,
                        mention the author. Quote if appropriate

                        - Include supporting details and contextual information 

                        - Capture all numerical data and specific examples


                        FORMAT: Respond ONLY with this JSON structure:


                        {
                          "info": "Detailed extracted information that answers the question. Include all relevant facts, figures, quotes (with attribution), dates, and statistics. Be comprehensive but focused on the question."
                        }


                        CONSTRAINTS:

                        - If you haven't found anything useful, don't make up
                        information or guess - simply say "No relevant
                        information found".

                        - Do not echo my prompt in your response

                        - Respond with a JSON object containing a single field:
                        "info"

                        - Write nothing other than the JSON

                        - Do not add information not present in the WEB PAGE
                        CONTENT

                        - Keep the style of the WEB PAGE CONTENT


                        Analyze the following WEB PAGE CONTENT to extract all
                        information relevant to MAIN QUESTION:


                        WEB PAGE CONTENT:

                        {{content}}
                      param: data
                      type: gpt
                      isolated: true
                      silent: true
                      label: ENHANCED WEB CONTENT EXTRACTOR
                    - index: first
                      type: calc
                      func: extract-json
                      to: data
                      param: data
                    - param: data.url
                      format: auto
                      type: calc
                      func: set
                      value: '{{item.url}}'
                    - param: data.title
                      value: '{{item.title}}'
                      type: calc
                      func: set
                      format: auto
                    - list: array
                      func: list-add
                      index: last
                      type: calc
                      item: data
                    - type: jump
                      to: SAY STATUS
                  condition: '{{content}} =~ ^[\s\S]{1000,}$'
                  label: FETCHED
                  type: group
              label: NOT REDDIT
              condition: >-
                {{hostname}} =~
                ^(?!.*(reddit\.com|redd\.it)).*(?:https?:\/\/|www\.)?[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}.*$
              type: group
            - steps:
                - args: item.url
                  code: >-
                    // Get URL from args parameter

                    const url = args['item.url'];


                    if (!url) {
                      return false;
                    }


                    const fullUrl = url.startsWith('http') ? url : 'https://' +
                    url;


                    window.location.href = fullUrl;

                    return true;
                  param: navigate.boolean
                  onFailure: ABORT NAVIGATE
                  label: NAVIGATE
                  type: js
                  timeout: 15000
                  silent: true
                - type: wait
                  for: idle
                  timeout: 6000
                  silent: true
                - type: wait
                  for: 2s
                  silent: true
                  label: ABORT NAVIGATE
                - steps:
                    - param: setThread
                      value: '{{thread}}'
                      label: SET CONTENT
                      type: calc
                      func: set
                      format: ''
                    - code: |-
                        // For setThread
                        const text = args['setThread'];
                        const tokenCount = Math.ceil(text.length / 4);

                        return {
                          chars: text.length,
                          estimatedTokens: tokenCount,
                          estimatedWords: Math.ceil(tokenCount * 0.75)
                        };
                      onFailure: SOCIAL-MEDIA PROMPT
                      type: js
                      args: setThread
                      param: count
                      timeout: 15000
                      silent: true
                      label: COUNT
                    - prompt: >-
                        # Enhanced Web Content Extractor


                        You are a precise information extraction specialist.


                        OUTPUT LANGUAGE: LANGUAGE BASED ON WEB PAGE CONTENT

                        MAIN QUESTION:

                        {{question}}


                        TASK: Imagine you are a Google Search engine, collecting
                        relevant information from various websites to form an
                        answer to my QUESTION.


                        EXTRACTION INSTRUCTIONS:

                        - Focus on providing detailed factual information,
                        statistics, and specific examples when available

                        - Gather all information that will be useful in
                        answering my question, including context, numbers, dates
                        and concrete details

                        - Extract as much useful information as possible from
                        the WEB PAGE CONTENT

                        - Avoid general phrases

                        - When referring to someone's opinion about something,
                        mention the author. Quote if appropriate

                        - Include supporting details and contextual information 

                        - Capture all numerical data and specific examples


                        FORMAT: Respond ONLY with this JSON structure:


                        {
                          "info": "Detailed extracted information that answers the question. Include all relevant facts, figures, quotes (with attribution), dates, and statistics. Be comprehensive but focused on the question."
                        }


                        CONSTRAINTS:

                        - If you haven't found anything useful, don't make up
                        information or guess - simply say "No relevant
                        information found".

                        - Do not echo my prompt in your response

                        - Respond with a JSON object containing a single field:
                        "info"

                        - Write nothing other than the JSON

                        - Do not add information not present in the WEB PAGE
                        CONTENT

                        - Keep the style of the WEB PAGE CONTENT


                        Analyze the following WEB PAGE CONTENT to extract all
                        information relevant to MAIN QUESTION:


                        WEB PAGE CONTENT:

                        {{setThread}}
                      type: gpt
                      isolated: true
                      param: data
                      silent: true
                      label: SOCIAL-MEDIA PROMPT
                  label: SOCIAL-MEDIA
                  condition: >-
                    {{hostname}} =~
                    ^(?:reddit|facebook|twitter|telegram|discord|whatsapp)\.com$
                  type: group
                - steps:
                    - param: setPageThread
                      value: '{{page}} {{thread}}'
                      type: calc
                      func: set
                      format: ''
                      label: SET CONTENT
                    - code: |-
                        // For setPageThread
                        const text = args['setPageThread'];
                        const tokenCount = Math.ceil(text.length / 4);

                        return {
                         chars: text.length,
                         estimatedTokens: tokenCount,
                         estimatedWords: Math.ceil(tokenCount * 0.75)
                        };
                      onFailure: NOT-SOCIAL-MEDIA PROMPT
                      type: js
                      args: setPageThread
                      param: count
                      timeout: 15000
                      silent: true
                      label: COUNT
                    - prompt: >-
                        # Enhanced Web Content Extractor


                        You are a precise information extraction specialist.


                        OUTPUT LANGUAGE: LANGUAGE BASED ON WEB PAGE CONTENT

                        MAIN QUESTION:

                        {{question}}


                        TASK: Imagine you are a Google Search engine, collecting
                        relevant information from various websites to form an
                        answer to my QUESTION.


                        EXTRACTION INSTRUCTIONS:

                        - Focus on providing detailed factual information,
                        statistics, and specific examples when available

                        - Gather all information that will be useful in
                        answering my question, including context, numbers, dates
                        and concrete details

                        - Extract as much useful information as possible from
                        the WEB PAGE CONTENT

                        - Avoid general phrases

                        - When referring to someone's opinion about something,
                        mention the author. Quote if appropriate

                        - Include supporting details and contextual information 

                        - Capture all numerical data and specific examples


                        FORMAT: Respond ONLY with this JSON structure:


                        {
                          "info": "Detailed extracted information that answers the question. Include all relevant facts, figures, quotes (with attribution), dates, and statistics. Be comprehensive but focused on the question."
                        }


                        CONSTRAINTS:

                        - If you haven't found anything useful, don't make up
                        information or guess - simply say "No relevant
                        information found".

                        - Do not echo my prompt in your response

                        - Respond with a JSON object containing a single field:
                        "info"

                        - Write nothing other than the JSON

                        - Do not add information not present in the WEB PAGE
                        CONTENT

                        - Keep the style of the WEB PAGE CONTENT


                        Analyze the following WEB PAGE CONTENT to extract all
                        information relevant to MAIN QUESTION:


                        WEB PAGE CONTENT:

                        {{setPageThread}}
                      type: gpt
                      isolated: true
                      param: data
                      silent: true
                      label: NOT-SOCIAL-MEDIA PROMPT
                  condition: >-
                    {{hostname}} =~
                    ^(?!(?:reddit|facebook|twitter|telegram|discord|whatsapp)\.com)[a-zA-Z0-9-]+\.[a-zA-Z]+$
                  label: NOT-SOCIAL-MEDIA
                  type: group
                - type: calc
                  func: extract-json
                  to: data
                  param: data
                  index: first
                - type: calc
                  func: set
                  param: data.url
                  format: auto
                  value: '{{item.url}}'
                - type: calc
                  func: set
                  param: data.title
                  format: auto
                  value: '{{item.title}}'
                - type: calc
                  func: list-add
                  index: last
                  list: array
                  item: data
              label: NOT FETCHED
              type: group
            - message: >-
                🔍 Analyzed **{{index}} / {{total}}**, 
                [{{hostname}}]({{pageUrl}})

                - Chars: {{count.chars}}

                - Estimated Tokens: {{count.estimatedTokens}}

                - Estimated Words: {{count.estimatedWords}}
              label: ⛔️SAY STATUS
              condition: '{{index}} = {{total1}}'
              type: say
            - message: |-
                🔍 Analyzed **{{percentage}}**, [{{hostname}}]({{pageUrl}})
                - Chars: {{count.chars}}
                - Estimated Tokens: {{count.estimatedTokens}}
                - Estimated Words: {{count.estimatedWords}}
              condition: '{{index}} != {{total}}'
              type: say
              label: SAY STATUS
            - code: |-
                const content = args['array'];
                const stringContent = JSON.stringify(content);

                return {
                  chars: stringContent.length,
                  estimatedTokens: Math.ceil(stringContent.length / 4),
                  estimatedWords: Math.ceil(stringContent.length / 4 * 0.75)
                };
              param: arrayLength
              label: ARRAY LENGTH
              type: js
              args: array
              timeout: 15000
              onFailure: SAY STATUS
              silent: true
            - message: |-
                ✅ **{{percentage}}** pages scanned for "**{{query}}**". 

                Last checked page: [{{hostname}}]({{pageUrl}})
                - Chars: {{count.chars}}
                - Estimated Tokens: {{count.estimatedTokens}}
                - Estimated Words: {{count.estimatedWords}}
                - Array Token length: **{{arrayLength.estimatedTokens}}**
              condition: '{{index}} = {{total}}'
              type: say
              label: SAY STATUS
          type: loop
          list: links
      type: loop
      list: subqueries
    - message: >-
        🤖 Analyzed **{{array.length}}** pages from **{{subqueries.length}}**
        Google searches. 

        Now preparing a comprehensive answer to your question.
      type: say
  label: DEPTH >= 1
  type: group
  condition: '{{depth}} >= 1'
- prompt: >-
    # Research Agent Prompt


    You are a Research AI Agent specialized in analyzing web search results to
    provide comprehensive answers.


    OUTPUT LANGUAGE: {{language}}


    TASK: Analyze the provided web search results (`INFORMATION FROM THE WEB`)
    to generate a detailed answer to the user's question (`MY QUESTION`).


    CONTEXT: The goal is to synthesize information from the provided web search
    results. Extract key details, integrate findings from multiple sources, and
    cite them accurately using the specified format.


    OUTPUT FORMAT:

    Structure your response precisely as follows:


    ## Key takeaway:

    [Provide the single most important takeaway from the web results in one
    paragraph.]


    ## Detailed answer:

    [Present a comprehensive analysis answering the question. Use bullet points
    for clarity. Integrate information from multiple sources. Cite every piece
    of information using fully rendered markdown links like [➊](URL), [➋](URL),
    etc., corresponding to the provided web results, placed directly within or
    at the end of relevant bullet points.]


    CONSTRAINTS:

    -   Base the answer *only* on the provided `INFORMATION FROM THE WEB`.

    -   Extract and synthesize as much relevant information as possible from the
    web search results.

    -   Integrate insights from multiple web search sources.

    -   Ensure inclusion of relevant links in a fully rendered markdown format:
    `[SYMBOL](URL)` (e.g., [➊](URL), [➋](URL)) to each bullet point where the
    source is used.

    -   Each unique URL must correspond to only one symbol: ➊ ➋ ➌ ➍ ➎ ➏ ➐ ➑ ➒ ➓.

    -   If multiple sources repeat the same information, describe it and cite
    all applicable sources together (e.g., [➊](URL)[➌](URL)[➎](URL)).

    -   Do *not* hallucinate facts or information.

    -   Do not use any tools other than analyzing the provided `INFORMATION FROM
    THE WEB`.

    -   Avoid general phrases; be specific and detailed.

    -   Adhere strictly to the "Key takeaway" and "Detailed answer" structure.


    MY QUESTION:

    {{question}}


    INFORMATION FROM THE WEB:

    {{array}}


    COMPREHENSIVE RESPONSE WITH SOURCE LINKS ([➊](URL)):
  label: FINAL ANSWER
  type: gpt
  isolated: true
  param: ''
  dumb: false
- type: js
  args: array
  code: |-
    function formatMarkdownLinks(args) {
      // Get the data from the arguments
      let data = args.array;
      
      // If data is a string that looks like JSON, try to parse it
      if (typeof data === 'string' && (data.trim().startsWith('[') || data.trim().startsWith('{'))) {
        try {
          data = JSON.parse(data);
        } catch (e) {
          // Not valid JSON, continue with text processing
        }
      }
      
      // Check whether we have an array of objects with a url property
      if (Array.isArray(data) && data.length > 0 && typeof data[0] === 'object' && data[0].url) {
        // Circled numbers used for formatting
        const circleNumbers = "➊➋➌➍➎➏➐➑➒➓⓫⓬⓭⓮⓯⓰⓱⓲⓳⓴㉑㉒㉓㉔㉕㉖㉗㉘㉙㉚㉛㉜㉝㉞㉟㊱㊲㊳㊴㊵".split("");
        
        // Array of unique links
        const uniqueLinks = [];
        
        // Extract the URL from each object in the array
        data.forEach((item, index) => {
          if (item.url && index < circleNumbers.length) {
            try {
              const url = item.url;
              const urlObj = new URL(url);
              const fullPath = urlObj.hostname.replace(/^www\./, '') + urlObj.pathname;
              const shortPath = fullPath.length > 42
                ? fullPath.slice(0, 42) + '...'
                : fullPath.replace(/\/$/, '');
              
              // Add the formatted link to the uniqueLinks array
              uniqueLinks.push(`[${circleNumbers[index]}](${url}) [${shortPath}](${url})`);
            } catch (e) {
              // Skip invalid URLs
            }
          }
        });
        
        // Return the formatted links, or a message if none were found
        return uniqueLinks.length > 0
          ? uniqueLinks.join('\n')
          : "No URLs found in the data.";
      }
      
      // If it is not a JSON array of objects with URLs,
      // fall back to text-based extraction
      const text = typeof data === 'string' ? data : JSON.stringify(data);
      
      if (!text) return "No text provided for link extraction.";
      
      // Regex pattern for links with ➊, ➋, ➌ etc.
      const linkPattern = /\[(➊|➋|➌|➍|➎|➏|➐|➑|➒|➓|⓫|⓬|⓭|⓮|⓯|⓰|⓱|⓲|⓳|⓴|㉑|㉒|㉓|㉔|㉕|㉖|㉗|㉘|㉙|㉚|㉛|㉜|㉝|㉞|㉟|㊱|㊲|㊳|㊴|㊵)\]\(([^)]+)\)/g;
      
      // Extended list of circled numbers
      const circleNumbers = "➊➋➌➍➎➏➐➑➒➓⓫⓬⓭⓮⓯⓰⓱⓲⓳⓴㉑㉒㉓㉔㉕㉖㉗㉘㉙㉚㉛㉜㉝㉞㉟㊱㊲㊳㊴㊵".split("");
      
      // Map for storing unique links
      const uniqueLinks = new Map();
      
      // Find all matches
      let match;
      let currentIndex = 0;
      while ((match = linkPattern.exec(text)) !== null) {
        try {
          const symbol = match[1];
          const url = match[2];
          
          // Create a URL object for validation and parsing
          const urlObj = new URL(url);
          
          // Extract domain and path (without a leading www.)
          const fullPath = urlObj.hostname.replace(/^www\./, '') + urlObj.pathname;
          
          // Shorten if longer than 42 characters
          const shortPath = fullPath.length > 42
            ? fullPath.slice(0, 42) + '...'
            : fullPath.replace(/\/$/, '');
          
          // Build a key combining symbol and URL
          const key = `${symbol}:${url}`;
          
          // Each unique symbol/URL combination is added once
          if (!uniqueLinks.has(key)) {
            uniqueLinks.set(key, `[${symbol}](${url}) [${shortPath}](${url})`);
            currentIndex++;
          }
        } catch (e) {
          // Skip invalid URLs
        }
      }
      
      // If no formatted links were found, look for plain URLs
      if (uniqueLinks.size === 0) {
        const urlPattern = /(https?:\/\/[^\s\)\]\'",:;]+)/g;
        let urlMatch;
        currentIndex = 0;
        
        while ((urlMatch = urlPattern.exec(text)) !== null) {
          try {
            const url = urlMatch[1].replace(/[.,;:?!]+$/, ''); // Clean up the URL
            
            // Create a URL object for validation and parsing
            const urlObj = new URL(url);
            
            // Extract domain and path (without a leading www.)
            const fullPath = urlObj.hostname.replace(/^www\./, '') + urlObj.pathname;
            
            // Shorten if longer than 42 characters
            const shortPath = fullPath.length > 42
              ? fullPath.slice(0, 42) + '...'
              : fullPath.replace(/\/$/, '');
            
            // Stop once we have more URLs than circled numbers
            if (currentIndex >= circleNumbers.length) break;
            
            // Each URL gets its own symbol
            const key = `${currentIndex}:${url}`;
            uniqueLinks.set(key, `[${circleNumbers[currentIndex]}](${url}) [${shortPath}](${url})`);
            currentIndex++;
          } catch (e) {
            // Skip invalid URLs
          }
        }
      }
      
      // Return the formatted output, or a message if no links were found
      return uniqueLinks.size > 0
        ? Array.from(uniqueLinks.values()).join('\n')
        : "No numbered reference links found in the text.";
    }

    return formatMarkdownLinks(args);
  param: sources
  timeout: 15000
  onFailure: DISPLAY SOURCES
  label: EXTRACT SOURCES
  silent: true
- type: say
  message: |-
    ![icon](/img/commands/maps-globe-06.svg) Sources

    {{sources}}
  label: DISPLAY SOURCES
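For readers inspecting the command, the one-symbol-per-URL citation rule used by the FINAL ANSWER prompt and the EXTRACT SOURCES step (each unique URL maps to exactly one circled number, reused on repeat citations) can be sketched in a few lines of standalone JavaScript. This is an illustrative sketch only; the `symbolFor` map and the example URLs are hypothetical and not part of the command:

```javascript
// Illustrative sketch of the one-symbol-per-URL citation rule: each unique
// URL is assigned the next circled number the first time it is seen, and
// repeat citations reuse that same symbol.
const circleNumbers = [...'➊➋➌➍➎➏➐➑➒➓'];
const citations = [
  'https://example.org/a',  // hypothetical source URLs
  'https://example.org/b',
  'https://example.org/a',  // repeat citation - must reuse the first symbol
];

const symbolFor = new Map();
for (const url of citations) {
  if (!symbolFor.has(url)) symbolFor.set(url, circleNumbers[symbolFor.size]);
}

console.log(symbolFor.get('https://example.org/a'));  // ➊
console.log(symbolFor.get('https://example.org/b'));  // ➋
```

Grouped citations such as [➊](URL)[➌](URL) then fall out naturally: each URL keeps its symbol, so citing several sources on one bullet just concatenates their existing links.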
Notice: Please read before using

This automation command is created by a community member. HARPA AI team does not audit community commands.

Please review the command carefully and only install if you trust the creator.
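One part worth reviewing is the NOT-SOCIAL-MEDIA condition's hostname regex, which you can sanity-check as plain JavaScript. A minimal sketch, assuming HARPA's `=~` operator applies the pattern as a standard regular expression (that operator's exact semantics are an assumption here); note that because the tail of the pattern only allows a single `label.tld` pair, subdomained hosts such as `www.reddit.com` or `en.wikipedia.org` fail to match as well:

```javascript
// The hostname filter from the NOT-SOCIAL-MEDIA step, reproduced verbatim.
// The negative lookahead rejects bare social-media domains; the tail
// [a-zA-Z0-9-]+\.[a-zA-Z]+$ then requires exactly one "label.tld" pair.
const filter = /^(?!(?:reddit|facebook|twitter|telegram|discord|whatsapp)\.com)[a-zA-Z0-9-]+\.[a-zA-Z]+$/;

console.log(filter.test('arxiv.org'));       // true  - regular two-label host
console.log(filter.test('reddit.com'));      // false - blocked by the lookahead
console.log(filter.test('www.reddit.com'));  // false - extra label fails the tail pattern
```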
