GitHub
Discover and manage GitHub repositories and hosting services in your Devgraph knowledge graph.
Overview
The GitHub molecule connects to GitHub's API to automatically discover repositories within your organizations. It creates entities for repos and the GitHub hosting service itself, along with relationships between them.
Key Features:
- Discover repositories from multiple organizations
- Filter repos by name patterns (regex)
- Support for GitHub App authentication (3x higher rate limits)
- Read .devgraph.yaml files from repos for additional entities
- Track repository metadata (languages, description, etc.)
Quick Start
providers:
  - name: github
    type: github
    every: 300 # Reconcile every 5 minutes
    config:
      namespace: default
      token: ${GITHUB_TOKEN}
      selectors:
        - organization: myorg
          repo_name: ".*" # All repos
Configuration
Basic Configuration
providers:
  - name: github-prod
    type: github
    every: 300
    config:
      namespace: production
      base_url: https://github.com
      api_url: https://api.github.com
      token: ${GITHUB_TOKEN}
      selectors:
        - organization: mycompany
          repo_name: "backend-.*"
          graph_files:
            - .devgraph.yaml
        - organization: mycompany
          repo_name: "frontend-.*"
GitHub App Configuration (Recommended)
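GitHub App installations can get higher rate limits than a PAT: up to 15,000 requests/hour (3x the 5,000/hour PAT limit), depending on your GitHub plan and the size of the installation: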
providers:
  - name: github-app
    type: github
    every: 300
    config:
      namespace: default
      app_id: 123456
      app_private_key: ${GITHUB_APP_PRIVATE_KEY}
      installation_id: 789012
      selectors:
        - organization: myorg
Configuration Options
| Option | Type | Required | Default | Description |
|---|---|---|---|---|
| namespace | string | No | default | Namespace for created entities |
| base_url | string | No | https://github.com | GitHub web interface URL |
| api_url | string | No | https://api.github.com | GitHub API endpoint |
| token | string | Yes* | - | Personal Access Token |
| app_id | integer | Yes* | - | GitHub App ID |
| app_private_key | string | Yes* | - | GitHub App private key |
| installation_id | integer | Yes* | - | GitHub App installation ID |
| selectors | list | Yes | [] | Repository selectors |
*Either token OR (app_id + app_private_key + installation_id) is required.
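The either/or rule above can be checked before a provider starts. The following is a minimal sketch, not Devgraph's actual validation code: the function name `resolve_auth_mode`, its error messages, and the choice to reject configs that set both methods are assumptions; only the option names come from the table.

```python
from typing import Any, Mapping

# Option names taken from the Configuration Options table.
APP_KEYS = ("app_id", "app_private_key", "installation_id")


def resolve_auth_mode(config: Mapping[str, Any]) -> str:
    """Return "token" or "app" for a provider config, or raise ValueError.

    Hypothetical helper that mirrors the documented rule: either `token`
    or all three GitHub App options must be set.
    """
    has_token = bool(config.get("token"))
    present = [key for key in APP_KEYS if config.get(key)]

    if present and len(present) < len(APP_KEYS):
        missing = [key for key in APP_KEYS if key not in present]
        raise ValueError(f"incomplete GitHub App config, missing: {missing}")
    if present and has_token:
        # Assumption: treat an ambiguous config as an error instead of guessing.
        raise ValueError("set either token or the GitHub App options, not both")
    if present:
        return "app"
    if has_token:
        return "token"
    raise ValueError("set token, or app_id + app_private_key + installation_id")
```

Failing fast on a partial GitHub App config tends to be easier to debug than a 401 response at reconcile time.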
Selector Options
Each selector specifies which repos to discover:
| Option | Type | Required | Default | Description |
|---|---|---|---|---|
| organization | string | Yes | - | GitHub organization name |
| repo_name | string | No | .* | Regex pattern for repo names |
| graph_files | list | No | [.devgraph.yaml] | Files to parse for entities |
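To make the defaults in the table concrete, here is a small sketch of how one selector entry might look once loaded. The `Selector` class is hypothetical and is not part of Devgraph; only the field names and default values are taken from the table.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Selector:
    """Hypothetical in-memory form of one selector entry."""

    organization: str  # required, no default
    repo_name: str = ".*"  # default: match every repo
    graph_files: List[str] = field(default_factory=lambda: [".devgraph.yaml"])


# A selector that only names the organization picks up both defaults.
selector = Selector(organization="myorg")
print(selector.repo_name, selector.graph_files)
```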
Authentication
Personal Access Token (PAT)
Rate Limit: 5,000 requests/hour
- Go to GitHub Settings → Developer settings → Personal access tokens → Tokens (classic)
- Click "Generate new token"
- Select scopes:
  - repo - For private repositories
  - public_repo - For public repositories only
  - read:org - To read organization data
- Copy token and set in config:
token: ${GITHUB_TOKEN}
Set environment variable:
export GITHUB_TOKEN="ghp_xxxxxxxxxxxx"
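The `${GITHUB_TOKEN}` placeholder is resolved from the environment when the config is loaded. How Devgraph performs that substitution is not described here; the sketch below only illustrates the idea with Python's standard `string.Template`, and `expand_env` is a hypothetical helper name.

```python
import os
from string import Template
from typing import Mapping, Optional


def expand_env(value: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Replace ${NAME} placeholders in a config value with environment values.

    Raises KeyError when a referenced variable is not set, so a missing
    token is reported at load time instead of as a 401 later on.
    """
    env = os.environ if env is None else env
    return Template(value).substitute(env)


# Example with an explicit mapping instead of the real environment.
print(expand_env("${GITHUB_TOKEN}", {"GITHUB_TOKEN": "ghp_xxxxxxxxxxxx"}))
```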
GitHub App (Recommended)
Rate Limit: up to 15,000 requests/hour (depends on GitHub plan and installation size)
- Create GitHub App:
- Go to Organization Settings → Developer settings → GitHub Apps
- Click "New GitHub App"
- Set permissions:
  - Repository: Contents (Read-only)
  - Organization: Members (Read-only)
- Generate private key and download
- Install app on organization
- Note the App ID and Installation ID
- Configure:
app_id: 123456
app_private_key: |
  -----BEGIN RSA PRIVATE KEY-----
  ... (your private key) ...
  -----END RSA PRIVATE KEY-----
installation_id: 789012
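Or export the values as environment variables and reference them from the config, for example app_private_key: ${GITHUB_APP_PRIVATE_KEY}: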
export GITHUB_APP_ID="123456"
export GITHUB_APP_PRIVATE_KEY="$(cat private-key.pem)"
export GITHUB_APP_INSTALLATION_ID="789012"
Entities Created
GithubHostingService
Represents the GitHub platform instance.
Entity Structure:
apiVersion: entities.devgraph.ai/v1
kind: GithubHostingService
metadata:
  name: github
  namespace: default
  labels:
    organization: github
spec:
  api_url: https://api.github.com
One hosting service entity is created per provider.
GithubRepository
Represents a GitHub repository.
Entity Structure:
apiVersion: entities.devgraph.ai/v1
kind: GithubRepository
metadata:
  name: my-service
  namespace: default
  labels:
    owner: myorg
spec:
  owner: myorg
  name: my-service
  url: https://github.com/myorg/my-service
  description: "My awesome service"
  languages:
    Python: 15234
    JavaScript: 8432
Fields:
- owner: Organization or user that owns the repo
- name: Repository name
- url: Web URL to the repository
- description: Repository description
- languages: Map of language names to bytes of code
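Because `languages` stores byte counts, percentages have to be derived by the consumer. A minimal sketch, using the values from the example entity above; `language_shares` is an illustrative helper, not a Devgraph function.

```python
from typing import Dict


def language_shares(languages: Dict[str, int]) -> Dict[str, float]:
    """Convert a language -> bytes map into a language -> percent map."""
    total = sum(languages.values())
    if total == 0:
        # Empty repos report no languages; avoid dividing by zero.
        return {}
    return {name: round(100 * size / total, 1) for name, size in languages.items()}


print(language_shares({"Python": 15234, "JavaScript": 8432}))
# {'Python': 64.4, 'JavaScript': 35.6}
```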
Relationships
HOSTED_BY
Every repository is linked to the hosting service.
GithubRepository --HOSTED_BY--> GithubHostingService
Example:
myorg/backend-api --HOSTED_BY--> github
Graph Files
The GitHub molecule can read .devgraph.yaml files from repositories to discover additional entities and relationships. This allows repos to declare their own metadata.
Example .devgraph.yaml:
entities:
  - apiVersion: entities.devgraph.ai/v1
    kind: Service
    metadata:
      name: backend-api
      labels:
        team: platform
    spec:
      type: rest-api
      port: 8080
relations:
  - source:
      kind: Service
      name: backend-api
    target:
      kind: GithubRepository
      name: backend-api
    relation: IMPLEMENTED_BY
Configure which files to read:
selectors:
  - organization: myorg
    graph_files:
      - .devgraph.yaml
      - .devgraph/entities.yaml
Filtering Repositories
Use regex patterns to filter repos:
Match All Repos
repo_name: ".*"
Match Prefix
repo_name: "backend-.*" # backend-api, backend-worker, etc.
Match Suffix
repo_name: ".*-service" # api-service, auth-service, etc.
Exclude Pattern
repo_name: "^(?!archive-).*" # Exclude repos starting with "archive-"
Multiple Patterns
Use multiple selectors:
selectors:
  - organization: myorg
    repo_name: "backend-.*"
  - organization: myorg
    repo_name: "frontend-.*"
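Patterns can be tried against a list of repo names before they go into the config. This sketch uses Python's `re` module and assumes the pattern must match the whole repo name (`re.fullmatch`); the molecule's actual regex engine and anchoring behavior are not stated here, so treat both as assumptions. The repo names and `select_repos` are made up for illustration.

```python
import re
from typing import Iterable, List


def select_repos(names: Iterable[str], patterns: Iterable[str]) -> List[str]:
    """Return the names matched by at least one pattern (union of selectors)."""
    compiled = [re.compile(pattern) for pattern in patterns]
    return [name for name in names if any(rx.fullmatch(name) for rx in compiled)]


repos = ["backend-api", "frontend-web", "auth-service", "archive-legacy"]

print(select_repos(repos, ["backend-.*", "frontend-.*"]))
# ['backend-api', 'frontend-web']
print(select_repos(repos, [".*-service"]))
# ['auth-service']
print(select_repos(repos, ["^(?!archive-).*"]))
# ['backend-api', 'frontend-web', 'auth-service']
```

The exclude pattern relies on a negative lookahead, which not every regex engine supports; if the pattern is rejected, listing the wanted prefixes in separate selectors is an alternative.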
Use Cases
Multi-Organization Discovery
selectors:
  - organization: company-backend
  - organization: company-frontend
  - organization: company-platform
Environment Separation
# Production
- name: github-prod
  config:
    namespace: production
    selectors:
      - organization: myorg
        repo_name: "prod-.*"
# Staging
- name: github-staging
  config:
    namespace: staging
    selectors:
      - organization: myorg
        repo_name: "staging-.*"
Team-Based Organization
selectors:
  - organization: myorg
    repo_name: "platform-.*"
    graph_files: [.devgraph.yaml]
  - organization: myorg
    repo_name: "product-.*"
    graph_files: [.devgraph.yaml]
Troubleshooting
Rate Limiting
Symptom: Logs show "GitHub API rate limit low" or 403 errors
Solutions:
- Switch to GitHub App authentication (3x higher limits)
- Increase reconciliation interval (every: 600 for 10 minutes)
- Reduce number of organizations/repos
Check rate limit:
curl -H "Authorization: Bearer $GITHUB_TOKEN" \
https://api.github.com/rate_limit
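The same check can be scripted. The sketch below calls the same `/rate_limit` endpoint with the standard library and summarizes the `core` bucket from the JSON response; the helper names are illustrative, and nothing here is taken from Devgraph's own code.

```python
import json
import urllib.request
from datetime import datetime, timezone


def summarize_core_limit(payload: dict) -> dict:
    """Pick the REST API ("core") bucket out of a /rate_limit response."""
    core = payload["resources"]["core"]
    resets_at = datetime.fromtimestamp(core["reset"], tz=timezone.utc)
    return {
        "limit": core["limit"],
        "remaining": core["remaining"],
        "resets_at": resets_at.isoformat(),
    }


def fetch_rate_limit(token: str, api_url: str = "https://api.github.com") -> dict:
    """Fetch the current rate limit status for the given token."""
    request = urllib.request.Request(
        f"{api_url}/rate_limit",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


# Usage (requires network access and a valid token):
#   import os
#   print(summarize_core_limit(fetch_rate_limit(os.environ["GITHUB_TOKEN"])))
```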
Authentication Errors
Symptom: "401 Unauthorized" or "403 Forbidden"
Solutions:
- Verify token is correct and not expired
- Check token has required scopes (repo, read:org)
- For GitHub App: Verify app is installed on organization
- For GitHub App: Check private key format (must include BEGIN/END markers)
Missing Repositories
Symptom: Expected repos not appearing
Solutions:
- Check repo_name pattern matches repo names
- Verify token/app has access to repos (private repos need repo scope)
- Check organization name is correct
- Review logs for errors
Private Key Format Issues
Symptom: "Invalid private key format" error
Solution: Ensure private key has proper formatting:
app_private_key: |
  -----BEGIN RSA PRIVATE KEY-----
  MIIEpAIBAAKCAQEA...
  ...
  -----END RSA PRIVATE KEY-----
Performance Tips
- Use GitHub Apps: 3x higher rate limits
- Optimize selectors: Filter at API level with specific org/repo patterns
- Adjust interval: Balance freshness vs. API usage
- Minimize graph_files: Only parse files you need
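The trade-off between freshness and API usage can be estimated with simple arithmetic. This sketch assumes the provider is the only consumer of the credential and that every reconcile cycle uses roughly the same number of requests; `requests_per_cycle` is an illustrative helper.

```python
def requests_per_cycle(hourly_limit: int, every_seconds: int) -> int:
    """Upper bound on API requests one reconcile cycle can spend.

    hourly_limit: rate limit of the credential (e.g. 5000 for a PAT)
    every_seconds: the provider's `every` setting
    """
    return hourly_limit * every_seconds // 3600


for every in (300, 600):
    print(every, requests_per_cycle(5000, every))
# 300 416
# 600 833
```

If a single cycle needs more requests than this budget, raising `every` or narrowing the selectors keeps the provider under the limit.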
Integration Examples
Link Argo Applications to Repos
The Argo molecule can automatically link applications to GitHub repos:
# In Argo app, repoURL matches GitHub repo spec.url
ArgoApplication --USES--> GithubRepository
Link Vercel Projects to Repos
The Vercel molecule creates relations based on Git URLs:
VercelProject --USES--> GithubRepository
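Both integrations depend on a Git URL from another system lining up with a repository's `spec.url`. How the Argo and Vercel molecules compare URLs is not documented here; the sketch below shows one plausible normalization (SSH form, trailing `.git`, trailing slash, letter case) so that common variants compare equal. `normalize_repo_url` is a hypothetical helper, and `ssh://` URLs are not handled.

```python
import re


def normalize_repo_url(url: str) -> str:
    """Reduce common Git URL variants to https://host/owner/repo."""
    url = url.strip()
    # git@github.com:owner/repo.git -> https://github.com/owner/repo.git
    ssh = re.fullmatch(r"git@([^:]+):(.+)", url)
    if ssh:
        url = f"https://{ssh.group(1)}/{ssh.group(2)}"
    url = url.rstrip("/")
    if url.endswith(".git"):
        url = url[: -len(".git")]
    # Assumption: owner and repo names are compared case-insensitively.
    return url.lower()


print(normalize_repo_url("git@github.com:myorg/my-service.git"))
# https://github.com/myorg/my-service
```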
Next Steps
- Configure GitLab molecule for GitLab repositories
- Set up Argo CD molecule to link deployments to repos
- Learn about entity definitions
- Explore relationships