HARPA.AI

🧩  WhatsApp Messages Extraction

Extracts messages from a WhatsApp conversation. Run this command while a WhatsApp chat is open in your browser. #extraction

Created by Adrian Larsson
Updated on Nov 9, 2024 04:32
Installed 70 times
RUNS JS CODE

Content

- type: ask
  param: targetMessageCount
  options:
    - label: 50 messages
      value: 50
    - label: 100 messages
      value: 100
    - label: 200 messages
      value: 200
    - $custom
  default: ''
  vision:
    enabled: false
    mode: area
    send: true
    hint: ''
  message: >-
    📌 You can use this command or JS code as a base for creating other commands
    or automations.


    How many messages would you like to extract?
  optionsInvalid: false
- type: js
  code: |
    async function scrollAndCollectMessages(targetMessageCount) {
      // Configuration object for DOM selectors
      const config = {
        application: 'div[role="application"]',
        messageSelectors: '.message-in, .message-out', 
        textSelector: 'span.selectable-text.copyable-text',
        copyableText: '.copyable-text'
      }

      async function collectMessages() {
        // Initialize storage arrays and counters
        const messages = []
        const uniqueMessageIds = new Set()
        let retries = 0
        const maxRetries = 5
        let lastKnownAuthor = ''

        // Scroll up and wait for messages to load
        async function scrollAndWait() {
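          // Optional chaining avoids a TypeError when the application node is missing,
          // so the error branch below is actually reachable
          const scrollableElement = document.querySelector(config.application)?.parentElement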
          
          if (scrollableElement) {
            console.log('Scrolling to load more messages...')
            scrollableElement.scrollTop = 0
            // Wait for messages to render
            await new Promise(resolve => setTimeout(resolve, 1500))
          } else {
            console.error('Error: Unable to find the scrollable element.')
          }
        }

        // Extract messages from current view
        function extractCurrentMessages() {
          // Get all message elements and convert to array
          const messageElements = Array.from(document.querySelectorAll(config.messageSelectors))
          let newMessagesFound = false

          // Iterate through messages from bottom to top
          for (let i = messageElements.length - 1; i >= 0; i--) {
            const message = messageElements[i]
            
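            // Extract message content
            const contentElement = message.querySelector(config.textSelector)
            const content = contentElement?.textContent || ''

            // Take the last HH:MM in the bubble. This assumes the timestamp is rendered
            // after the body, so an earlier match may be a time typed in the message text
            const timeMatches = message.textContent.match(/\d{1,2}:\d{2}/g)
            const time = timeMatches ? timeMatches[timeMatches.length - 1] : ''

            // Dedup key; two identical texts sent in the same minute collapse into one
            const messageId = `${time}-${content}`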

            // Skip if message already processed
            if (uniqueMessageIds.has(messageId)) continue

            // Extract author information
            const prePlainText = message.querySelector(config.copyableText)?.getAttribute('data-pre-plain-text')
            let author = prePlainText ? prePlainText.match(/] (.*?):/)?.[1] : null

            // Handle missing author case
            if (!author || author === 'Unknown') {
              author = lastKnownAuthor
            } else {
              lastKnownAuthor = author
            }

            // Store valid messages
            if (content && author) {
              uniqueMessageIds.add(messageId)
              messages.push(`${time}. ${author}: ${content}`)
              newMessagesFound = true
            }
          }

          return newMessagesFound
        }

        // Main collection loop
        while (messages.length < targetMessageCount && retries < maxRetries) {
          const newMessagesFound = extractCurrentMessages()
          console.log(`Collected ${messages.length} messages so far.`)

          // Continue scrolling if needed
          if (messages.length < targetMessageCount) {
            await scrollAndWait()
            if (!newMessagesFound) {
              retries++
            } else {
              retries = 0
            }
          }
        }

        console.log(`Total messages found: ${messages.length}`)

        // Trim excess messages if needed
        if (messages.length > targetMessageCount) {
          console.log(`Trimming results from ${messages.length} to ${targetMessageCount}`)
          messages.splice(targetMessageCount)
        }

        // Restore chronological order
        messages.reverse()
        return messages
      }

      // Main execution with error handling
      try {
        const messages = await collectMessages()
        console.log("Extracted messages:")
        messages.forEach(msg => console.log(msg))
        return messages
      } catch (error) {
        console.error("Error during extraction:", error)
        return null
      }
    }

    return scrollAndCollectMessages(targetMessageCount)
  param: messages
  timeout: 150000
  args: targetMessageCount
- type: say
  message: |-
    **Extracted Data:**

    {{messages}}
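
The command reads the author from the `data-pre-plain-text` attribute with a short regex. As a rough sketch, the same attribute could also supply the time and date, which would avoid scanning the message text for a timestamp. The helper name `parsePrePlainText` is hypothetical, and the attribute format `[time, date] Author: ` is an assumption based on the regex used above; WhatsApp Web may change it at any time.

```javascript
// Hypothetical helper, not part of the command above.
// Assumes the attribute looks like "[10:15, 11/9/2024] Alice: ".
function parsePrePlainText(prePlainText) {
  // Groups: time before the comma, date up to "]", author up to the final colon
  const match = /^\[([^,\]]+),\s*([^\]]+)\]\s(.*?):\s*$/.exec(prePlainText ?? '')
  if (!match) return null
  const [, time, date, author] = match
  return { time, date, author }
}

// Example usage with a made-up attribute value
const parsed = parsePrePlainText('[10:15, 11/9/2024] Alice: ')
console.log(parsed) // { time: '10:15', date: '11/9/2024', author: 'Alice' }
```

The function returns `null` when the string does not match, so a caller can fall back to the last known author, as the command already does.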
Notice: Please read before using

This automation command was created by a community member. The HARPA AI team does not audit community commands.

Please review the command carefully and only install if you trust the creator.
