Salar Abbaspourazad
machine learning scientist
Apple


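$\mathrm{DPO} = -\mathbb{E}_{(x, y_w, y_l) \sim D}\left[\log \sigma\left(\beta \log \frac{\pi_\theta(y_w \mid x)}{\pi_{\mathrm{ref}}(y_w \mid x)} - \beta \log \frac{\pi_\theta(y_l \mid x)}{\pi_{\mathrm{ref}}(y_l \mid x)}\right)\right]$
$\mathrm{CE} = -\sum_i \left[y_i \log \hat{y}_i + (1 - y_i) \log(1 - \hat{y}_i)\right]$
$H(X) = -\sum_x p(x) \log p(x)$
$\mathrm{InfoNCE} = -\log \frac{\exp(q \cdot k^{+} / \tau)}{\sum_i \exp(q \cdot k_i / \tau)}$
$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{Q K^{T}}{\sqrt{d_k}}\right) V$
$\frac{dy}{dx} = \frac{dy}{du} \times \frac{du}{dx}$
$D_{\mathrm{KL}}(P \,\|\, Q) = \sum_x P(x) \log \frac{P(x)}{Q(x)}$
$P(\tau) = \rho_0(s_0) \cdot \prod_t \pi(a_t \mid s_t) \cdot p(r_t, s_{t+1} \mid s_t, a_t)$
$\mathrm{softmax}(x_i) = \frac{e^{x_i}}{\sum_j e^{x_j}}$
$\theta = \theta - \alpha \nabla J(\theta)$
$P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}$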