Specification
- avro - (forks: 1066) (stars: 1594) (watchers: 1594) - apache avro is a data serialization system.
Registries
- schema registry - (forks: 736) (stars: 1234) (watchers: 1234) - confluent schema registry for kafka
- schema registry ui - (forks: 88) (stars: 321) (watchers: 321) - web tool for avro schema registry
- schemer - (forks: 3) (stars: 90) (watchers: 90) - schema registry for csv, tsv, json, avro and parquet schema. supports schema inference and graphql api.
Queries
- rq - (forks: 45) (stars: 1553) (watchers: 1553) - record query - a tool for doing record analysis and transformation
Education
- examples - (forks: 458) (stars: 670) (watchers: 670) - apache kafka and confluent platform examples and demos
- kafka storm starter - (forks: 335) (stars: 726) (watchers: 726) - code examples that show how to integrate apache kafka 0.8+ with apache storm 0.9+ and apache spark streaming 1.1+, while using apache avro as the data serialization format.
- avro hadoop starter - (forks: 86) (stars: 111) (watchers: 111) - example mapreduce jobs in java, hive, pig, and hadoop streaming that work on avro data.
- Avro2TF - (forks: 19) (stars: 118) (watchers: 118) - avro2tf is designed to fill the gap of making users' training data ready to be consumed by deep learning training frameworks.
Serialization
- avsc - (forks: 98) (stars: 844) (watchers: 844) - avro for javascript :zap:
- avro4s - (forks: 178) (stars: 536) (watchers: 536) - avro schema generation and serialization / deserialization for scala
- fastavro - (forks: 115) (stars: 362) (watchers: 362) - fast avro for python
- gogen avro - (forks: 66) (stars: 191) (watchers: 191) - generate go code to serialize and deserialize avro schemas
- avrohugger - (forks: 82) (stars: 147) (watchers: 147) - generate scala case class definitions from avro schemas
- scalavro - (forks: 31) (stars: 119) (watchers: 119) - a reflection-based avro library in scala.
- abracad - (forks: 31) (stars: 107) (watchers: 107) - a clojure library for de/serializing clojure data structures with avro.
- python avro json serializ - (forks: 32) (stars: 104) (watchers: 104) - serializes data into a json format using avro schema.
- avro_turf - (forks: 44) (stars: 97) (watchers: 97) - a library that makes it easier to use the avro serialization format from ruby.
- avro rs - (forks: 48) (stars: 89) (watchers: 89) - avro client library implementation in rust
- json schema avro - (forks: 22) (stars: 102) (watchers: 102) - avro to json schema, and back
- jsAvroPhonetic - (forks: 56) (stars: 84) (watchers: 84) - a javascript implementation of avro phonetic
- kafka avro - (forks: 34) (stars: 76) (watchers: 76) - node.js bindings for librdkafka with avro schema serialization.
- pyavroc - (forks: 17) (stars: 46) (watchers: 46) - an avro file reader/writer for python
- BlueSteel - (forks: 15) (stars: 47) (watchers: 47) - an avro encoding/decoding library for swift.
- libserdes - (forks: 35) (stars: 36) (watchers: 36) - avro serialization/deserialization c/c++ library with confluent schema-registry support
- vulcan - (forks: 8) (stars: 46) (watchers: 46) - functional avro for scala
- avro schema - (forks: 2) (stars: 48) (watchers: 48) - apache avro schema tools for tarantool
Generators
- xml avro - (forks: 56) (stars: 58) (watchers: 58) - generate avro schema and avro binary from xsd schema and xml
Connectors
- spark avro - (forks: 316) (stars: 535) (watchers: 535) - avro data source for apache spark
- cpp serializers - (forks: 82) (stars: 484) (watchers: 484) - benchmark comparing various data serialization libraries (thrift, protobuf etc.) for c++
Code Generation
- gradle avro plugin - (forks: 53) (stars: 135) (watchers: 135) - a gradle plugin to allow easily performing java code generation for apache avro. it supports json schema declaration files, json protocol declaration files, and avro idl files.
- sbt avrohugger - (forks: 37) (stars: 95) (watchers: 95) - sbt plugin for generating scala sources for apache avro schemas and protocols.
- avromatic - (forks: 11) (stars: 56) (watchers: 56) - generate ruby models from avro schemas
Tabular
- iceberg - (forks: 48) (stars: 363) (watchers: 363) - iceberg is a table format for large, slow-moving tabular data
Toolchains
- DevOps Python tools - (forks: 152) (stars: 310) (watchers: 310) - 80+ devops & data cli tools - aws, log anonymizer, spark, hadoop, hbase, hive, impala, linux, docker, spark data converters & validators (avro/parquet/json/csv/ini/xml/yaml), travis ci, ambari, blueprints, cloudformation, elasticsearch, solr, pig, ipython - python / jython tools
- bigdata playground - (forks: 54) (stars: 157) (watchers: 157) - a complete example of a big data application using: kubernetes (kops/aws), apache spark sql/streaming/mllib, apache flink, scala, python, apache kafka, apache hbase, apache parquet, apache avro, apache storm, twitter api, mongodb, nodejs, angular, graphql
Data Store
- chana - (forks: 50) (stars: 332) (watchers: 332) - avro data store based on akka
Data Generation
- ratatool - (forks: 45) (stars: 251) (watchers: 251) - a tool for data sampling, data generation, and data diffing
Conversion
- json wikipedia - (forks: 41) (stars: 241) (watchers: 241) - json wikipedia, contains code to convert the wikipedia xml dump into a json/avro dump
- json avro converter - (forks: 60) (stars: 158) (watchers: 158) - json to avro conversion tool designed to make migration to avro easier.
Database
- storagetapper - (forks: 46) (stars: 205) (watchers: 205) - storagetapper is a scalable realtime mysql change data streaming, logical backup and logical replication service
Binary
- jackson dataformats binar - (forks: 67) (stars: 187) (watchers: 187) - uber-project for standard jackson binary format backends: avro, cbor, protobuf, smile
IDE
- vscode data preview - (forks: 20) (stars: 168) (watchers: 168) - data preview 🈸 extension for importing 📤 viewing 🔎 slicing 🔪 dicing 🎲 charting 📊 & exporting 📥 large json array/config, yaml, apache arrow, avro & excel data files
Documentation
- avrodoc - (forks: 60) (stars: 121) (watchers: 121) - documentation tool for avro schemas
Validation
- aptos - (forks: 16) (stars: 141) (watchers: 141) - :sunny: a tool for validating data using json schema and converting json schema documents into different data-interchange formats
Command Line Interface
- schema registry - (forks: 24) (stars: 96) (watchers: 96) - a cli and go client for kafka schema registry
Semantics
- schema_salad - (forks: 33) (stars: 40) (watchers: 40) - semantic annotations for linked avro data
Like JSON Schema, Avro is a very data-centric specification. I need to better understand how leading providers like Confluent use it to power Kafka, and I also want to better understand its relationship to JSON Schema and how it is used with AsyncAPI and OpenAPI. This dive gave me a fresh look at how the API space is evolving, and at how data and our databases are still king when it comes to everything API.
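To make the relationship between Avro and JSON Schema a little more concrete, here is a minimal sketch in Python that maps a flat Avro record schema onto a roughly equivalent JSON Schema object. Everything in it is an assumption for illustration: the `USER_AVRO` schema, the `avro_record_to_json_schema` helper, the type mapping table, and the choice to treat fields without a default as required. It only handles primitive types and unions of primitives, and it is not how any of the projects listed above perform the conversion.

```python
import json

# Assumed mapping from Avro primitive types to JSON Schema types.
# Avro "bytes" has no direct JSON Schema counterpart; "string" is a simplification.
AVRO_TO_JSON_TYPE = {
    "null": "null",
    "boolean": "boolean",
    "int": "integer",
    "long": "integer",
    "float": "number",
    "double": "number",
    "string": "string",
    "bytes": "string",
}

# Hypothetical Avro record schema used only for this example.
USER_AVRO = {
    "type": "record",
    "name": "User",
    "fields": [
        {"name": "name", "type": "string"},
        {"name": "age", "type": ["null", "int"], "default": None},
        {"name": "email", "type": "string"},
    ],
}


def avro_record_to_json_schema(avro_schema):
    """Map a flat Avro record schema to a JSON Schema object (sketch only)."""
    if avro_schema.get("type") != "record":
        raise ValueError("only record schemas are handled in this sketch")

    properties = {}
    required = []
    for field in avro_schema["fields"]:
        avro_type = field["type"]
        # An Avro union is written as a list of types; normalize to a list.
        branches = avro_type if isinstance(avro_type, list) else [avro_type]
        json_types = []
        for branch in branches:
            if not isinstance(branch, str) or branch not in AVRO_TO_JSON_TYPE:
                raise ValueError(f"unsupported type in this sketch: {branch!r}")
            mapped = AVRO_TO_JSON_TYPE[branch]
            if mapped not in json_types:
                json_types.append(mapped)
        properties[field["name"]] = {
            "type": json_types[0] if len(json_types) == 1 else json_types
        }
        # Assumption: a field with no default is treated as required.
        if "default" not in field:
            required.append(field["name"])

    return {
        "title": avro_schema["name"],
        "type": "object",
        "properties": properties,
        "required": required,
    }


if __name__ == "__main__":
    print(json.dumps(avro_record_to_json_schema(USER_AVRO), indent=2))
```

The two formats overlap on structure (named fields with types) but differ in emphasis: Avro schemas drive binary serialization and schema evolution, while JSON Schema describes and validates JSON documents, so a real converter has to make more decisions than this sketch does.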