Analyzing GitHub Issues with Graphlit

Archana Vaidheeswaran

April 3, 2024

Building on the groundwork in our previous article, we're about to get hands-on with Graphlit, turning it from a concept into a working model that simplifies your first — or next — contribution to projects like OpenAI's Python client.

Following our introduction to the engaging world of open-source and the innovative tools that can help you navigate it, let's dive into a practical application that bridges the gap between interest and action.

In this article, we'll break down the process of creating a Graphlit feed, setting up an LLM (Large Language Model) specification, and interacting with GitHub issues to extract valuable insights.

Let’s delve into the technical steps that bring this functionality to life.

To build this, we will follow three steps:

  1. We will use Graphlit to create a feed that will ingest GitHub issues from an open-source project.

  2. We will create a specification to configure the LLM used for publishing.

  3. Using our LLM specification, we can summarize and report on the ingested GitHub issues.

You can follow along with the sample application code on GitHub. To see a sample application like this in action, you can check out this Streamlit app. You can also see a guided tour of the sample application here.

Creating a Graphlit Feed

A Graphlit feed is a way to bulk ingest content from a website, RSS feed, cloud storage or other locations. Feeds can also be configured to ingest content on a recurring schedule, so that any new content is also indexed by Graphlit.

We will use the createFeed mutation to create our GitHub issues feed. You will need to specify the feed type and the repository owner and name, and provide a name for your feed. The example variables also pass a GitHub personal access token and a read limit of 10 issues.

mutation CreateFeed($feed: FeedInput!) {
    createFeed(feed: $feed) {
        id
        name
        state
        type
    }
}
{
    "feed": {
        "type": "ISSUE",
        "issue": {
            "type": "GIT_HUB_ISSUES",
            "github": {
                "repositoryOwner": "openai",
                "repositoryName": "openai-python",
                "personalAccessToken": "redacted"
            },
            "readLimit": 10
        },
        "name": "Python Client"
    }
}
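As a rough sketch, the mutation and variables above could be packaged into a GraphQL-over-HTTP request body from Python. The helper name `build_create_feed_request` is hypothetical and not part of any Graphlit SDK. The sketch only builds the JSON body; the endpoint URL and authentication headers are left out on purpose because they depend on your Graphlit project settings.

```python
import json

# The same mutation shown above, kept as a plain string.
CREATE_FEED_MUTATION = """
mutation CreateFeed($feed: FeedInput!) {
    createFeed(feed: $feed) {
        id
        name
        state
        type
    }
}
"""


def build_create_feed_request(owner, repo, token, name, read_limit=10):
    """Return the JSON body for a GraphQL POST (hypothetical helper)."""
    variables = {
        "feed": {
            "type": "ISSUE",
            "issue": {
                "type": "GIT_HUB_ISSUES",
                "github": {
                    "repositoryOwner": owner,
                    "repositoryName": repo,
                    "personalAccessToken": token,
                },
                "readLimit": read_limit,
            },
            "name": name,
        }
    }
    # A GraphQL HTTP request carries the query text and its variables together.
    return json.dumps({"query": CREATE_FEED_MUTATION, "variables": variables})


body = build_create_feed_request(
    "openai", "openai-python", "redacted", "Python Client"
)
print(json.loads(body)["variables"]["feed"]["name"])
```

Building the body with `json.dumps` also guards against hand-written JSON mistakes such as trailing commas, which strict JSON parsers reject.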


Creating an LLM Specification

The next step is to create an LLM specification, which lets us configure the LLM's parameters. Parameters such as temperature and probability control how the LLM responds to prompts. For instance, increasing the temperature makes responses more varied and creative, but also increases the risk of hallucinations.
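To build some intuition for temperature, here is a toy sketch of temperature-scaled softmax, a common way temperature is applied when sampling tokens. The logits are made-up scores for three candidate tokens, and the function name is illustrative; this is not a description of how any particular provider implements sampling internally.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert raw scores to probabilities, scaled by temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]  # toy scores for three candidate tokens
for t in (0.2, 0.5, 1.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At lower temperatures the probability mass concentrates on the top-scoring token, so output is more predictable; at higher temperatures the distribution flattens and less likely tokens are chosen more often.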

Typically, when building RAG applications, you will want to create LLM specifications and reuse them across conversations and publishing operations.

In the code below, we use the createSpecification mutation to create a specification for Anthropic’s latest Claude 3 Haiku model. We also set the temperature to 0.5, the probability to 0.2, and the completion token limit to 2048.

mutation CreateSpecification($specification: SpecificationInput!) {
    createSpecification(specification: $specification) {
        id
        name
        state
        type
        serviceType
    }
}
{
    "specification": {
        "type": "COMPLETION",
        "serviceType": "ANTHROPIC",
        "anthropic": {
            "model": "CLAUDE_3_HAIKU",
            "temperature": 0.5,
            "probability": 0.2,
            "completionTokenLimit": 2048
        },
        "name": "Summarization"
    }
}


Reporting on the GitHub Issues Feed

With our GitHub Issues feed ingested and LLM specification created, we are ready to publish our report.

We can use an LLM to find recurring themes in the ingested GitHub issues and then categorize the issues into those themes. This condenses many issues into a few workstreams and helps new open-source contributors find issues to start on.

First, we will need to create a publishing prompt.

Here is an example:

Write me a report of recurring themes across all GitHub issues, which can be used to group issues into workstreams.  For each theme, provide an example of issues which fall into this theme.

You can edit this prompt to refine the report for your needs.

Next, we will use the publishContents mutation to summarize the GitHub issues, and use the LLM to write our report.

You will need to specify the feed identifier and the LLM specification identifier.

You will also have to set the type field to TEXT and format field to MARKDOWN since we want to publish as Markdown text.

mutation PublishContents($connector: ContentPublishingConnectorInput!, $publishPrompt: String!, $summarySpecification: EntityReferenceInput, $publishSpecification: EntityReferenceInput, $filter: ContentFilter) {
  publishContents(connector: $connector, publishPrompt: $publishPrompt, summarySpecification: $summarySpecification, publishSpecification: $publishSpecification, filter: $filter) {
      id
      name
      markdown
  }
}
{
    "filter": {
        "types": [
            "ISSUE"
        ],
        "feeds": [
            { 
                "id": "your-feed-identifier"
            }
        ]
    },
    "connector": {
        "type": "TEXT",
        "format": "MARKDOWN"
    },
    "publishPrompt": "Write me a report of recurring themes across all GitHub issues, which can be used to group issues into workstreams.  For each theme, provide an example of issues which fall into this theme.",
    "summarySpecification": {
        "id": "your-specification-identifier"
    },
    "publishSpecification": {
        "id": "your-specification-identifier"
    }
}
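Since the feed and specification identifiers come from the earlier mutations, it can be convenient to assemble these variables in code rather than by hand. The following is a minimal sketch; `build_publish_variables` is a hypothetical helper, and the identifiers passed in are placeholders that you would replace with the `id` values returned by createFeed and createSpecification.

```python
import json


def build_publish_variables(feed_id, specification_id, prompt):
    """Assemble the publishContents variables (hypothetical helper)."""
    return {
        # Only publish issues that were ingested by our feed.
        "filter": {"types": ["ISSUE"], "feeds": [{"id": feed_id}]},
        # Publish the report as Markdown text.
        "connector": {"type": "TEXT", "format": "MARKDOWN"},
        "publishPrompt": prompt,
        # The same specification is used for summarizing and publishing,
        # as in the example variables above.
        "summarySpecification": {"id": specification_id},
        "publishSpecification": {"id": specification_id},
    }


prompt = (
    "Write me a report of recurring themes across all GitHub issues, "
    "which can be used to group issues into workstreams.  For each theme, "
    "provide an example of issues which fall into this theme."
)
variables = build_publish_variables(
    "your-feed-identifier", "your-specification-identifier", prompt
)
print(json.dumps(variables, indent=4))
```

Keeping the prompt in a variable makes it easy to iterate on the wording without touching the rest of the request.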


Here is our report:


And that’s it! You have successfully ingested a GitHub Issues feed and reported on themes which can be used to group issues into workstreams.

With Graphlit at your disposal, you’re now equipped to cut through the clutter and find your niche in any project.

Remember, tools like these are designed to enhance your skills and amplify your impact. Dive in, experiment, and watch as your contributions shape the future of technology.


Summary

Please email any questions on this article or the Graphlit Platform to questions@graphlit.com.

For more information, you can read our Graphlit Documentation, visit our marketing site, or join our Discord community.