Verified FPGA acceleration

Move your hardest compute to FPGA — verified, and shipped faster.

I design, verify, and deliver production-ready FPGA accelerators for compute-heavy workloads. AI-augmented for speed, sign-off-rigorous so it's safe for silicon — every module passes synthesis and bit-exact golden-model simulation before it ships.

64 DSP MAC engines
200 MHz fabric clock, timing met
12.8 GMAC/s peak (target)
0FPGA / RTL engineering
gemm_accel · verification
module gemm_accel  // 8×8 systolic · Q1.15
ghdl  analyze ........ PASS
ghdl  elaborate ...... PASS
vivado ooc synth ..... PASS
sim vs golden ........ BIT-EXACT
timing @200 MHz ....... MET

Why teams call me

You have a compute bottleneck.
You don't have an FPGA team.

Your CPU/GPU paths are too slow, too power-hungry, or too costly to scale. FPGAs can deliver an order-of-magnitude win, but design, verification, and timing closure are where projects stall. I own that, end to end.

Latency & throughput

Output-stationary systolic and streaming dataflows that keep the silicon fed and hit your cycle budget.

🔋

Power & cost at the edge

Fixed-point, resource-aware designs that fit cheap parts and run cool — no datacenter GPU required.

🛡️

Trust & sign-off

Every block verified against a golden model and closed in timing — so it works on the board, not just in a demo.

Services

Engage at the level you need.

Start with a low-risk feasibility sprint; scale up to full delivery, verification, or ongoing capacity.

Feasibility & Architecture Sprint

$3k–6k

1–2 weeks. Spec, architecture, resource & timing budget, and a clear go/no-go before you commit a dollar to development.

Best first step

Accelerator Design & Delivery

$15k+

Synthesizable, verified RTL plus block design and simulation, delivered and integrated on your board.

Most popular

Verification & Sign-off

$8k–25k

Take your existing RTL to a verified, timing-closed, golden-checked, sign-off-ready state.

De-risk a build

Fractional FPGA Lead

$3k–8k/mo

Ongoing senior FPGA capacity: architecture reviews, design, and mentoring on a monthly retainer.

Ongoing

The approach

The speed of AI.
The rigor of hardware sign-off.

I run an AI-augmented RTL pipeline that generates modules fast — then gates every one through real verification before it's accepted. You don't get "AI code." You get verified designs, delivered faster because verification is automated and continuous.

01 · generate

AI-assisted RTL from a precise, human-authored spec.

02 · compile

GHDL analyze + elaborate; Vivado out-of-context synthesis.

03 · prove

Simulate bit-exact against a Python golden model — with regression gates.

04 · close

Timing closure and resource sign-off on the target part.

Nothing ships unverified. If it doesn't pass, it doesn't go in the build.
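The gating idea in the four steps above can be sketched in a few lines of Python. This is an illustrative outline only, not the actual pipeline: the names `first_mismatch` and `gate` and the stage labels are hypothetical, and the DUT output words are assumed to have already been captured from simulation as a list of integers.

```python
def first_mismatch(dut_words, golden_words):
    """Return (index, dut, golden) for the first differing sample, or None.

    A length difference counts as a mismatch at the first missing index.
    """
    for i, (d, g) in enumerate(zip(dut_words, golden_words)):
        if d != g:
            return (i, d, g)
    if len(dut_words) != len(golden_words):
        return (min(len(dut_words), len(golden_words)), None, None)
    return None


def gate(stage_results):
    """Accept a module only if every stage passed.

    stage_results maps a stage name (hypothetical labels) to True/False.
    Returns (accepted, list_of_failed_stage_names).
    """
    failed = [name for name, ok in stage_results.items() if not ok]
    return (not failed, failed)


# Example: one simulation sample differs from the golden model,
# so the module is rejected even though it compiles and meets timing.
golden = [8192, -4096, 32767]
dut = [8192, -4095, 32767]
results = {
    "compile": True,
    "prove": first_mismatch(dut, golden) is None,
    "close": True,
}
accepted, failed = gate(results)
```

Bit-exact comparison uses plain integer equality rather than a tolerance, so a single off-by-one rounding difference is enough to fail the gate.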

Selected work

Tiled GEMM accelerator on Zynq-7020

A dense matrix-multiply (GEMM) accelerator for the Zynq-7020 (PYNQ-Z2 / Zybo Z7-20), built entirely with free tools. An 8×8 output-stationary systolic array of 64 DSP slices computes in Q1.15 fixed-point, fed from DDR over the AXI3 HP ports with ping-pong double-buffering to hide memory latency.
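To make "output-stationary" concrete, here is a small cycle-by-cycle Python model of such an array. It is a sketch under stated assumptions, not the delivered RTL: operands are plain integers rather than Q1.15, rows of A enter from the left and columns of B from the top with a one-cycle skew per row or column, and the function names are hypothetical. Each processing element keeps its own accumulator in place while operands flow past it.

```python
def systolic_matmul(a, b):
    """Model an n x m output-stationary array computing a (n x k) times b (k x m)."""
    n, k, m = len(a), len(b), len(b[0])
    acc = [[0] * m for _ in range(n)]    # one stationary accumulator per PE
    a_reg = [[0] * m for _ in range(n)]  # operands travelling left to right
    b_reg = [[0] * m for _ in range(n)]  # operands travelling top to bottom
    cycles = k + n + m - 2               # k products plus fill/drain skew
    for t in range(cycles):
        # Shift right: row i receives a[i][t - i] at its left edge (skewed by i).
        a_reg = [[a[i][t - i] if 0 <= t - i < k else 0] + a_reg[i][:-1]
                 for i in range(n)]
        # Shift down: column j receives b[t - j][j] at its top edge (skewed by j).
        b_reg = [[b[t - j][j] if 0 <= t - j < k else 0 for j in range(m)]] + b_reg[:-1]
        # Every PE multiplies the pair currently passing through it.
        for i in range(n):
            for j in range(m):
                acc[i][j] += a_reg[i][j] * b_reg[i][j]
    return acc, cycles


def reference_matmul(a, b):
    """Straightforward triple-loop product used as the check."""
    return [[sum(a[i][p] * b[p][j] for p in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


a = [[i - j for j in range(8)] for i in range(8)]
b = [[i * j - 7 for j in range(8)] for i in range(8)]
c, cycles = systolic_matmul(a, b)
assert c == reference_matmul(a, b)
```

In this model an 8×8 tile with depth 8 takes 22 cycles, of which 14 are skew fill and drain. Keeping those cycles from stalling the array is one reason double-buffered operand delivery matters.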

Verified end-to-end: each module passes synthesis and is simulated bit-exact against a Python golden model with identical rounding and saturation. The design targets timing closure at 200 MHz on a -1 speed grade and builds in the free Vivado Standard Edition, with no paid license.
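A golden model for Q1.15 arithmetic can be very small. The sketch below is illustrative and assumes details the text does not specify: products are accumulated at full precision, the result is rounded half-up once at the end, and it saturates to the signed 16-bit range. The real design may round or saturate differently, and the function names are hypothetical.

```python
Q_FRAC = 15                      # Q1.15: 1 sign bit, 15 fractional bits
Q_MIN = -(1 << 15)               # -32768, represents -1.0
Q_MAX = (1 << 15) - 1            #  32767, represents just under +1.0


def saturate(x):
    """Clamp an integer to the signed 16-bit Q1.15 range."""
    return max(Q_MIN, min(Q_MAX, x))


def to_q15(value):
    """Quantize a float in [-1, 1) to a Q1.15 integer, saturating at the edges."""
    return saturate(int(round(value * (1 << Q_FRAC))))


def q15_dot(xs, ys):
    """Dot product of two Q1.15 vectors, returned as Q1.15.

    Each product is Q2.30; Python integers never overflow, so the
    accumulator here is effectively unbounded (an assumption: hardware
    would use a fixed accumulator width).
    """
    acc = 0
    for x, y in zip(xs, ys):
        acc += x * y
    acc += 1 << (Q_FRAC - 1)     # add half an LSB: round half up (assumed mode)
    acc >>= Q_FRAC               # arithmetic shift back to 15 fractional bits
    return saturate(acc)


half = to_q15(0.5)               # 16384
quarter = q15_dot([half], [half])        # 0.5 * 0.5 = 0.25
clipped = q15_dot([Q_MIN], [Q_MIN])      # (-1) * (-1) = +1, not representable
```

The second example is the classic Q1.15 corner case: -1.0 squared is +1.0, which does not fit, so the model must agree with the hardware on clamping it to 32767.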

VHDL-2008 | Zynq-7020 | DSP48E1 ×64 | AXI3 / HP | Q1.15 | MMCM · CDC | GHDL + Vivado | Golden-model sim
12.8 GMAC/s peak @ 200 MHz (target)
64 MAC PEs · 8×8 systolic
≥8× vs NEON single-core A9 (target)
$0 toolchain license (Vivado Std)
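The peak figure follows directly from the array size and the clock, assuming every PE completes one multiply-accumulate per cycle. The short calculation below reproduces it. The implied NEON baseline is only derived from the stated ≥8× target, not from a measurement, and the function name is hypothetical.

```python
def peak_gmacs(num_pes, clock_hz):
    """Peak throughput in GMAC/s, assuming one MAC per PE per clock cycle."""
    return num_pes * clock_hz / 1e9


peak = peak_gmacs(num_pes=8 * 8, clock_hz=200e6)   # 64 PEs at 200 MHz

# If the target is at least 8x over a single NEON core, the comparison
# baseline is at most peak / 8 (derived from the target, not measured).
implied_baseline = peak / 8

print(f"peak: {peak:.1f} GMAC/s, implied baseline: <= {implied_baseline:.1f} GMAC/s")
```

Peak is an upper bound. Sustained throughput also depends on how well memory transfers overlap with compute.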

Architecture, RTL, verification harness, and block-design integration — delivered with my AI-augmented pipeline.

How we work

From bottleneck to verified hardware.

01

Scope

We pin down the workload, constraints, and target part. Fixed-price, low risk.

02

Architect

Spec, dataflow, and a resource + timing budget you can trust before building.

03

Build

AI-augmented RTL implementation — fast, modular, and reviewable.

04

Verify

Synthesis + bit-exact golden simulation + timing closure. Sign-off quality.

05

Deliver

Integrated to your board, documented, with support through bring-up.

Mordecha Datskovsky, FPGA / RTL Engineer
Zynq · DSP · AXI · Verification

About

Senior FPGA engineering, owned end to end.

I'm an FPGA / RTL engineer with [X]+ years building acceleration and signal-processing hardware on Zynq and AMD-Xilinx devices [add 1–2 past roles or domains]. I specialize in the hard parts teams get stuck on: systolic and streaming dataflows, fixed-point DSP, AXI memory paths, clock-domain crossing, and timing closure.

My edge is method. I pair deep hardware judgment with an AI-augmented pipeline that makes delivery fast without cutting corners on verification — because in silicon, "looks right" isn't good enough. One engineer owns your result and signs off on it.

Let's talk

FAQ

Straight answers.

Isn't AI-generated HDL risky?

Unverified HDL is risky — whoever writes it. That's the whole point of my process: generation is fast, but every module is gated through synthesis and bit-exact golden-model simulation before it's accepted. You get the speed without the risk.

What does a typical engagement look like?

Most start with a fixed-price 1–2 week feasibility sprint: I deliver a spec, architecture, and a resource/timing budget with an honest go/no-go. If we proceed, that work credits toward full design & delivery.

Which devices and tools do you work with?

AMD-Xilinx Zynq / 7-series and the Vivado + GHDL toolchain, VHDL-2008, AXI3/AXI4, DSP48, fixed-point DSP, MMCM/CDC. [Add Verilog/Intel-Quartus/etc. if relevant to you.]

Do you work remotely / internationally?

Yes — fully remote, working with teams worldwide. Engagements are scoped and priced up front so there are no surprises.

How do you price?

By outcome, not by the hour: fixed-price sprints and project quotes, or a monthly retainer for ongoing capacity. You always know the number before we start.

Get started

Have a compute bottleneck worth solving in hardware?

Book a 20-minute call, or send a line about your workload and constraints. I'll tell you honestly whether FPGA is the right move.