12.01.3 Using AI for ASR and text summarization

note

Link to the scenario template

As an example of using AI nodes, let's create a scenario that will:

  • process the provided audio file;
  • transcribe the file into its full text;
  • generate a short summary of that text;
  • send the user a notification email containing the summary.

To transcribe the audio and summarize the text, we will use the whisper (preview) node from the AI: Automatic Speech Recognition group and the bart-large-cnn (preview) node from the AI: Summarization group.
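For readers who want to see what these two steps amount to outside the scenario editor, here is a minimal Python sketch. It assumes the nodes behave like the publicly available Whisper and bart-large-cnn models, loaded through the Hugging Face transformers library; the model identifiers, the function names, and the default max_length value are assumptions for illustration, not the platform's implementation.

```python
def load_models():
    """Load the two models locally (assumption: `transformers` and a backend
    such as PyTorch are installed; the first call downloads the weights)."""
    from transformers import pipeline

    asr = pipeline("automatic-speech-recognition", model="openai/whisper-small")
    summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
    return asr, summarizer


def transcribe_and_summarize(audio_path, asr, summarizer, max_length=130):
    """Return (full_text, summary) for one audio file.

    `asr` and `summarizer` are callables shaped like the transformers
    pipelines above: the first returns {"text": ...}, the second returns
    [{"summary_text": ...}].
    """
    full_text = asr(audio_path)["text"]
    # max_length plays the role of the Max Length (Integer) field.
    summary = summarizer(full_text, max_length=max_length)[0]["summary_text"]
    return full_text, summary
```

Passing the models in as arguments keeps the function easy to try with stand-ins before downloading anything. With real models, the call would look like `transcribe_and_summarize("recording.mp3", *load_models())`, where the file name is a placeholder.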

The scenario requires five nodes:

  • (1) Trigger on Webhook node, whose URL receives a POST request containing the audio file;
  • (2) whisper (preview) node, which transcribes the audio file. The file content is taken from the output of the Trigger on Webhook node;
  • (3) Create New Document from Text node, which writes the transcript to Google Drive. The file name can be anything, such as the current date and time. The file text is one of the output parameters of the whisper (preview) node. The node requires authorization to work correctly;
  • (4) bart-large-cnn (preview) node, which summarizes the text produced by the whisper (preview) node. The maximum length of the response can be set in the Max Length (Integer) field;
  • (5) Send Email node, which sends the summary generated by the bart-large-cnn (preview) node to a specified email address. The node requires authorization to work correctly.
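To start the scenario, something has to send the POST request described in step (1). The sketch below builds such a request with the Python standard library only. The webhook URL, the form field name "file", the default file name, and the choice of multipart/form-data encoding are all assumptions; use the URL shown in your own Trigger on Webhook node and adjust the field name if the trigger expects a different one.

```python
import uuid
import urllib.request

# Hypothetical placeholder: copy the real URL from the Trigger on Webhook node.
WEBHOOK_URL = "https://example.com/webhook/your-webhook-id"


def build_multipart(field_name, filename, payload, content_type):
    """Encode one file as multipart/form-data; return (body, header value)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def build_request(url, audio_bytes, filename="recording.mp3"):
    """Create (but do not send) the POST request carrying the audio file."""
    body, header = build_multipart("file", filename, audio_bytes, "audio/mpeg")
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": header}, method="POST"
    )


def send_audio(url, path):
    """Read the audio file, send it to the webhook, return the HTTP status."""
    with open(path, "rb") as f:
        request = build_request(url, f.read())
    with urllib.request.urlopen(request) as response:
        return response.status
```

Calling `send_audio(WEBHOOK_URL, "recording.mp3")` would then trigger one run of the scenario, provided the URL and file exist.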

The outputs of the scenario are:

  • a document containing the transcript of the provided audio file;
  • an email summarizing the content of the audio file.
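As a rough illustration of these two outputs, the following Python sketch builds a date-based document title and an email message carrying the summary. It uses only the standard library and does not send anything; the title format, subject line, sender address, and function names are hypothetical and not taken from the nodes themselves.

```python
from datetime import datetime
from email.message import EmailMessage


def document_title(now):
    """Name the transcript document after the given date and time."""
    return now.strftime("Transcript %Y-%m-%d %H:%M:%S")


def build_notification(recipient, summary, sender="noreply@example.com"):
    """Build (but do not send) the email that carries the summary."""
    message = EmailMessage()
    message["From"] = sender  # hypothetical sender address
    message["To"] = recipient
    message["Subject"] = "Audio summary"  # hypothetical subject line
    message.set_content(summary)
    return message
```

For example, `document_title(datetime.now())` gives a title for the current run, and `build_notification("user@example.com", summary)` gives a message whose body is the summary text.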