GenAI

694 bookmarks
The Problem with Reasoners
·aidanmclaughlin.notion.site·
How to Count Tokens - Tokenization With Tiktoken.
Counting tokens is a useful task in natural language processing (NLP) that allows us to measure the length and complexity of a text. Two important use cases for counting tokens are: controlling the length of the prompt - models have a limit …
·safjan.com·
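As a note on the bookmark above, here is a minimal sketch of the prompt-length use case it mentions. The helper names (`count_tokens`, `fits_in_limit`, `tiktoken_encoder`) and the `reserve_for_output` parameter are hypothetical, introduced only for illustration. The sketch assumes the third-party `tiktoken` package and its `cl100k_base` encoding; the right encoding depends on the model you target.

```python
from typing import Callable, List

# An encoder is any callable that turns text into a list of token ids.
Encoder = Callable[[str], List[int]]


def count_tokens(text: str, encode: Encoder) -> int:
    """Return the number of tokens the given encoder produces for text."""
    return len(encode(text))


def fits_in_limit(prompt: str, encode: Encoder, limit: int,
                  reserve_for_output: int = 0) -> bool:
    """Check that the prompt plus a reserved output budget stays within limit."""
    return count_tokens(prompt, encode) + reserve_for_output <= limit


def tiktoken_encoder(encoding_name: str = "cl100k_base") -> Encoder:
    """Build an encoder backed by tiktoken (assumed to be installed).

    The import is local so the helpers above work with any encoder,
    even when tiktoken is unavailable.
    """
    import tiktoken  # third-party: pip install tiktoken

    return tiktoken.get_encoding(encoding_name).encode
```

With tiktoken installed, `count_tokens("some prompt", tiktoken_encoder())` gives the token count under that encoding. Passing the encoder in as an argument keeps the limit check independent of any one tokenizer.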
A Multi-Agent Framework for Synthetic Data Generation
Presents MAG-V, a multi-agent framework that first generates a dataset of questions that mimic customer queries. It then reverse engineers alternate questions from responses to verify agent trajectories. Reports that the… — elvis (@omarsar0)
·x.com·
Agentless is a great example of how a more constrained agent is better than a general agent for specific tasks 💡 - it achieves much higher scores on SWE-Bench Lite for bug-fixing than other agent approaches 🛠️
The whole point is to not let the agent do everything, but to do a… — Jerry Liu (@jerryjliu0)
·x.com·
A Hierarchical Feature Extraction Model for Multi-Label Mechanical Patent Classification
Various studies have focused on feature extraction methods for automatic patent classification in recent years. However, most of these approaches are based on the knowledge from experts in related domains. Here we propose a hierarchical feature extraction model (HFEM) for multi-label mechanical patent classification, which is able to capture both local features of phrases as well as global and temporal semantics. First, an n-gram feature extractor based on convolutional neural networks (CNNs) is designed to extract salient local lexical-level features. Next, a long dependency feature extraction model based on the bidirectional long–short-term memory (BiLSTM) neural network model is proposed to capture sequential correlations from higher-level sequence representations. Then the HFEM algorithm and its hierarchical feature extraction architecture are detailed. We establish the training, validation and test datasets, containing 72,532, 18,133, and 2,679 mechanical patent documents, respectively, and then check the performance of HFEMs. Finally, we compare the results of the proposed HFEM and three other single neural network models, namely CNN, long–short-term memory (LSTM), and BiLSTM. The experimental results indicate that our proposed HFEM outperforms the other compared models in both precision and recall.
·mdpi.com·
DAIR.AI
Learn important prompt engineering techniques to build use cases with LLMs.
·dair-ai.thinkific.com·
Fundamental Research on Detecting Contradictions in Requirements: Taxonomy and Semi-Automated Approach
Requirements documents can contain several thousand individual requirements. They must be error-free to avoid unnecessary complications and costs in the later product development stages. An important part of this is to identify contradictions between two requirements. The first step is therefore to define what contradictions are and in what form they can occur in requirements documents. In this paper the scientific theories regarding contradictions are discussed with regard to their usefulness for the topic. In doing so, Aristotelian logic proved to provide the best basis for an application in the requirements engineering context. Based on this theory, we have created specific subtypes of contradictions to match them to the requirements engineering field. The identification of these subtypes is done by a formalization of the requirement sentences and a subsequent analysis by means of simple questions. To validate the method, industrial requirements documents were searched for contradictions. For each detected type of contradiction, we present an example of the detection process. Thereby, we show that the method is easy to apply and may also be used by non-specialists. Thus, our method provides a taxonomy as a basis for further research on automated contradiction detection as well as on automated quality analysis of requirements documents.
·mdpi.com·
ContraDoc: Understanding Self-Contradictions in Documents with Large Language Models | AI Research Paper Details
In recent times, large language models (LLMs) have shown impressive performance on various document-level tasks such as document classification, summarization, and question-answering. However, research on understanding their capabilities on the task of detecting self-contradictions in long documents has been very limited. In this work, we introduce ContraDoc, the first human-annotated dataset to study self-contradictions in long documents across multiple domains, varying document lengths, self-contradiction types, and scope. We then analyze the current capabilities of four state-of-the-art open-source and commercially available LLMs: GPT3.5, GPT4, PaLM2, and LLaMAv2 on this dataset. While GPT4 performs the best and can outperform humans on this task, we find that it is still unreliable and struggles with self-contradictions that require more nuance and context. We release the dataset and all the code associated with the experiments (https://github.com/ddhruvkr/CONTRADOC).
·aimodels.fyi·
LLM-powered data classification for data entities at scale
With the advent of the Large Language Model (LLM), new possibilities dawned for metadata generation and sensitive data identification at Grab. This prompted the inception of our project, which aimed to integrate LLM classification into our existing data management service. Read on to find out how we transformed what used to be a tedious and painstaking process into a highly efficient system and how it has empowered teams across the organisation.
·engineering.grab.com·