Apple Intelligence三层模型结构

苹果在AI上很久没有实质性进展了:
Siri多年没有进步,停止了造车项目,解散了部分AI团队。
虽然陆续低调的进行了一些AI公司收购,但没有什么可称道的成果,实在算不上有什么进展。

今年WWDC上,终于发布了AI相关的内容,一如既往的“重新定义”了AI的概念:发明了一个新词Apple Intelligence,缩写还是AI。

咱们仔细看一下这个Apple Intelligence,还是动了一些脑筋的,整体架构分了三层:
1、首先是在移动设备端,运行了一个30亿参数的小模型,处理一些简单的任务(苹果自研芯片,让小模型可以在功耗可控的情况下,及时响应这些请求)
2、如果本地模型无法处理,就将请求发送到是云端,通过苹果自己的大模型,响应用户请求
3、如果任务太复杂,苹果自家模型处理不好,则将请求发送到合作伙伴提供的大模型,比如GPT-4o等,合作伙伴会不断增加
当然,对于用户的授权,和数据隐私保护,还是做了不少工作的

这样乍一看,好像没有什么吗,就是集成了多个模型。但咱们加上一个事实后,这个事情就不这么简单了:
苹果对自己的操作系统完全可控,就让本地模型可以获取比竞争对手高的多的权限。
苹果自家模型,可以读邮件、可以看日程、可以访问通讯记录、可以查看网页浏览记录,可以搜集全部图像。。。
也就是说,苹果的自家模型,可以高效收集客户设备上所有信息。
同样的,苹果自家模型,可以调用用户设备全部的功能,包括第三方APP的功能。
通过整合这些信息,就可以让苹果自家模型,吊打全部竞争对手。

细思极恐,在移动小模型上,在IOS设备上,几乎已经没有了任何生存空间。
如果Google也在安卓上,部署自己的小模型,那安卓设备上的机会,也就不存在了。
无论Google如何选择,国内厂商必然快速跟进,那手机小模型这个赛道很快就不存在了。
而第三方的移动小模型和应用,无论如何努力,由于无法控制操作系统底层,几乎不可能形成任何竞争优势,几乎必然出局。

可以看下,现在国内大模型赛道整体太卷了,小厂商几乎没有机会:
1、大模型的研发、训练,需要大量的资金、人员、算力、数据的投入,小厂玩不起,大厂不赚钱
2、开源大模型的性能,比闭源大模型并不差太多,而且也在疯狂迭代,没有商业模式,更没有资本愿意长期投入,小厂更玩不起
3、小厂在垂直赛道可能会有些机会,但如果市场足够大,被大厂嗅到,没有赚钱途径的大厂一定会下场卷死你
4、移动端小模型,上面也说了,没有操作系统权限,小厂几乎没有机会了
5、在APP创新上,国内互联网流量过于集中,应用开发出来只能依附于几个大流量平台。这些平台不会允许某几个应用过热,而且在有了热度后,大厂还会无良的抄小厂的作业,让某类APP瞬间消失

所以很可惜,虽然大家都知道大模型是个好东西。但国内环境太卷了:
没有给小厂的生态位,没有好的生态
就不会有大量的创新,后面难以出现百花齐放的场景
到头来,还是要等别人创新后,大厂去抄?
大家都懂,但停不下来。
卷来卷去,难有赢家。

好像扯远了。。。
其实,对于苹果,其实还有两个事情做的挺到位的
1、将prompt屏蔽了,让普通人可以更便捷的使用AI
2、再次发挥,强大的整合能力,提前抢占了移动AI的入口

当然,对于个人来说,用好大模型,提高自己获取知识的速度,提升自己的认知圈,扩展自己的能力边界,还是很重要的。

GoLang实现跨平台的一些技巧03

以新建文件为例,对比一下几个常见平台的区别。

继续看下MacOS平台的代码:

// os/file.go

// 新建文件
func Create(name string) (*File, error) {
	// 跳转到下面的OpenFile
	return OpenFile(name, O_RDWR|O_CREATE|O_TRUNC, 0666)
}

// OpenFile在这里还是平台无关的代码
func OpenFile(name string, flag int, perm FileMode) (*File, error) {
	testlog.Open(name)
	// 从openFileNolog开始,不同平台代码会有不同
	f, err := openFileNolog(name, flag, perm)
	if err != nil {
		return nil, err
	}
	f.appendMode = flag&O_APPEND != 0

	return f, nil
}
// os/file_unix.go

// openFileNolog的unix实现
func openFileNolog(name string, flag int, perm FileMode) (*File, error) {
	setSticky := false
	if !supportsCreateWithStickyBit && flag&O_CREATE != 0 && perm&ModeSticky != 0 {
		if _, err := Stat(name); IsNotExist(err) {
			setSticky = true
		}
	}

	var r int
	var s poll.SysFile
	for {
		var e error
		//跳转到open
		r, s, e = open(name, flag|syscall.O_CLOEXEC, syscallMode(perm))
		if e == nil {
			break
		}

		// We have to check EINTR here, per issues 11180 and 39237.
		if e == syscall.EINTR {
			continue
		}

		return nil, &PathError{Op: "open", Path: name, Err: e}
	}

	// open(2) itself won't handle the sticky bit on *BSD and Solaris
	if setSticky {
		setStickyBit(name)
	}

	// There's a race here with fork/exec, which we are
	// content to live with. See ../syscall/exec_unix.go.
	if !supportsCloseOnExec {
		syscall.CloseOnExec(r)
	}

	kind := kindOpenFile
	if unix.HasNonblockFlag(flag) {
		kind = kindNonBlock
	}

	// 封装为File结构
	f := newFile(r, name, kind)
	f.pfd.SysFile = s
	return f, nil
}
// os/file_open_unix.go

func open(path string, flag int, perm uint32) (int, poll.SysFile, error) {
	// 跳转到syscall.Open
	fd, err := syscall.Open(path, flag, perm)
	return fd, poll.SysFile{}, err
}
// syscall/zsyscall_darwin_amd64.go

func Open(path string, mode int, perm uint32) (fd int, err error) {
	var _p0 *byte
	_p0, err = BytePtrFromString(path)
	if err != nil {
		return
	}
	// 调用syscall
	r0, _, e1 := syscall(abi.FuncPCABI0(libc_open_trampoline), uintptr(unsafe.Pointer(_p0)), uintptr(mode), uintptr(perm))
	fd = int(r0)
	if e1 != 0 {
		err = errnoErr(e1)
	}
	return
}
// syscall/syscall_darwin.go
func syscall(fn, a1, a2, a3 uintptr) (r1, r2 uintptr, err Errno)

// internal/abi/funcpc.go
func FuncPCABI0(f interface{}) uintptr

// syscall/zsyscall_darwin_amd64.go
func libc_open_trampoline()
//go:cgo_import_dynamic libc_open open "/usr/lib/libSystem.B.dylib"

// 先通过abi.FuncPCABI0(libc_open_trampoline)先获取到open函数的地址
// 然后通过syscall调用open函数
// open函数是libc标准库中的函数,C语言定义为
int open(const char *pathname, int flags, mode_t mode);
//syscall在这里实现

//runtime/sys_darwin.go
//go:linkname syscall_syscall syscall.syscall
//go:nosplit
func syscall_syscall(fn, a1, a2, a3 uintptr) (r1, r2, err uintptr) {
	args := struct{ fn, a1, a2, a3, r1, r2, err uintptr }{fn, a1, a2, a3, r1, r2, err}
	entersyscall()
	//跳转到libcCall
	libcCall(unsafe.Pointer(abi.FuncPCABI0(syscall)), unsafe.Pointer(&args))
	exitsyscall()
	return args.r1, args.r2, args.err
}
func syscall()

// runtime/sys_libc.go
func libcCall(fn, arg unsafe.Pointer) int32 {
	// Leave caller's PC/SP/G around for traceback.
	gp := getg()
	var mp *m
	if gp != nil {
		mp = gp.m
	}
	if mp != nil && mp.libcallsp == 0 {
		mp.libcallg.set(gp)
		mp.libcallpc = getcallerpc()
		// sp must be the last, because once async cpu profiler finds
		// all three values to be non-zero, it will use them
		mp.libcallsp = getcallersp()
	} else {
		// Make sure we don't reset libcallsp. This makes
		// libcCall reentrant; We remember the g/pc/sp for the
		// first call on an M, until that libcCall instance
		// returns.  Reentrance only matters for signals, as
		// libc never calls back into Go.  The tricky case is
		// where we call libcX from an M and record g/pc/sp.
		// Before that call returns, a signal arrives on the
		// same M and the signal handling code calls another
		// libc function.  We don't want that second libcCall
		// from within the handler to be recorded, and we
		// don't want that call's completion to zero
		// libcallsp.
		// We don't need to set libcall* while we're in a sighandler
		// (even if we're not currently in libc) because we block all
		// signals while we're handling a signal. That includes the
		// profile signal, which is the one that uses the libcall* info.
		mp = nil
	}
	// 跳转到asmcgocall
	res := asmcgocall(fn, arg)
	if mp != nil {
		mp.libcallsp = 0
	}
	return res
}

// 硬件平台相关代码
// runtime/asm_arm64.s
// func asmcgocall(fn, arg unsafe.Pointer) int32
// Call fn(arg) on the scheduler stack,
// aligned appropriately for the gcc ABI.
// See cgocall.go for more details.
TEXT ·asmcgocall(SB),NOSPLIT,$0-20
	MOVD	fn+0(FP), R1
	MOVD	arg+8(FP), R0

	MOVD	RSP, R2		// save original stack pointer
	CBZ	g, nosave
	MOVD	g, R4

	// Figure out if we need to switch to m->g0 stack.
	// We get called to create new OS threads too, and those
	// come in on the m->g0 stack already. Or we might already
	// be on the m->gsignal stack.
	MOVD	g_m(g), R8
	MOVD	m_gsignal(R8), R3
	CMP	R3, g
	BEQ	nosave
	MOVD	m_g0(R8), R3
	CMP	R3, g
	BEQ	nosave

	// Switch to system stack.
	MOVD	R0, R9	// gosave_systemstack_switch<> and save_g might clobber R0
	BL	gosave_systemstack_switch<>(SB)
	MOVD	R3, g
	BL	runtime·save_g(SB)
	MOVD	(g_sched+gobuf_sp)(g), R0
	MOVD	R0, RSP
	MOVD	(g_sched+gobuf_bp)(g), R29
	MOVD	R9, R0

	// Now on a scheduling stack (a pthread-created stack).
	// Save room for two of our pointers /*, plus 32 bytes of callee
	// save area that lives on the caller stack. */
	MOVD	RSP, R13
	SUB	$16, R13
	MOVD	R13, RSP
	MOVD	R4, 0(RSP)	// save old g on stack
	MOVD	(g_stack+stack_hi)(R4), R4
	SUB	R2, R4
	MOVD	R4, 8(RSP)	// save depth in old g stack (can't just save SP, as stack might be copied during a callback)
	BL	(R1)
	MOVD	R0, R9

	// Restore g, stack pointer. R0 is errno, so don't touch it
	MOVD	0(RSP), g
	BL	runtime·save_g(SB)
	MOVD	(g_stack+stack_hi)(g), R5
	MOVD	8(RSP), R6
	SUB	R6, R5
	MOVD	R9, R0
	MOVD	R5, RSP

	MOVW	R0, ret+16(FP)
	RET

nosave:
	// Running on a system stack, perhaps even without a g.
	// Having no g can happen during thread creation or thread teardown
	// (see needm/dropm on Solaris, for example).
	// This code is like the above sequence but without saving/restoring g
	// and without worrying about the stack moving out from under us
	// (because we're on a system stack, not a goroutine stack).
	// The above code could be used directly if already on a system stack,
	// but then the only path through this code would be a rare case on Solaris.
	// Using this code for all "already on system stack" calls exercises it more,
	// which should help keep it correct.
	MOVD	RSP, R13
	SUB	$16, R13
	MOVD	R13, RSP
	MOVD	$0, R4
	MOVD	R4, 0(RSP)	// Where above code stores g, in case someone looks during debugging.
	MOVD	R2, 8(RSP)	// Save original stack pointer.
	BL	(R1)
	// Restore stack pointer.
	MOVD	8(RSP), R2
	MOVD	R2, RSP
	MOVD	R0, ret+16(FP)
	RET

// 然后回到openFileNolog中
// 在openFileNolog中,继续调用newFile,整体封装为File结构,原路返回
func newFile(fd int, name string, kind newFileKind) *File {
	f := &File{&file{
		pfd: poll.FD{
			Sysfd:         fd,
			IsStream:      true,
			ZeroReadIsEOF: true,
		},
		name:        name,
		stdoutOrErr: fd == 1 || fd == 2,
	}}

	pollable := kind == kindOpenFile || kind == kindPipe || kind == kindNonBlock

	// If the caller passed a non-blocking filedes (kindNonBlock),
	// we assume they know what they are doing so we allow it to be
	// used with kqueue.
	if kind == kindOpenFile {
		switch runtime.GOOS {
		case "darwin", "ios", "dragonfly", "freebsd", "netbsd", "openbsd":
			var st syscall.Stat_t
			err := ignoringEINTR(func() error {
				return syscall.Fstat(fd, &st)
			})
			typ := st.Mode & syscall.S_IFMT
			// Don't try to use kqueue with regular files on *BSDs.
			// On FreeBSD a regular file is always
			// reported as ready for writing.
			// On Dragonfly, NetBSD and OpenBSD the fd is signaled
			// only once as ready (both read and write).
			// Issue 19093.
			// Also don't add directories to the netpoller.
			if err == nil && (typ == syscall.S_IFREG || typ == syscall.S_IFDIR) {
				pollable = false
			}

			// In addition to the behavior described above for regular files,
			// on Darwin, kqueue does not work properly with fifos:
			// closing the last writer does not cause a kqueue event
			// for any readers. See issue #24164.
			if (runtime.GOOS == "darwin" || runtime.GOOS == "ios") && typ == syscall.S_IFIFO {
				pollable = false
			}
		}
	}

	clearNonBlock := false
	if pollable {
		if kind == kindNonBlock {
			// The descriptor is already in non-blocking mode.
			// We only set f.nonblock if we put the file into
			// non-blocking mode.
		} else if err := syscall.SetNonblock(fd, true); err == nil {
			f.nonblock = true
			clearNonBlock = true
		} else {
			pollable = false
		}
	}

	// An error here indicates a failure to register
	// with the netpoll system. That can happen for
	// a file descriptor that is not supported by
	// epoll/kqueue; for example, disk files on
	// Linux systems. We assume that any real error
	// will show up in later I/O.
	// We do restore the blocking behavior if it was set by us.
	if pollErr := f.pfd.Init("file", pollable); pollErr != nil && clearNonBlock {
		if err := syscall.SetNonblock(fd, false); err == nil {
			f.nonblock = false
		}
	}

	runtime.SetFinalizer(f.file, (*file).close)
	return f
}

GoLang实现跨平台的一些技巧02

以新建文件为例,对比一下几个常见平台的区别。

继续看下Linux平台的代码:

// os/file.go

// 新建文件
func Create(name string) (*File, error) {
	// 跳转到下面的OpenFile
	return OpenFile(name, O_RDWR|O_CREATE|O_TRUNC, 0666)
}

// OpenFile在这里还是平台无关的代码
func OpenFile(name string, flag int, perm FileMode) (*File, error) {
	testlog.Open(name)
	// 从openFileNolog开始,不同平台代码会有不同
	f, err := openFileNolog(name, flag, perm)
	if err != nil {
		return nil, err
	}
	f.appendMode = flag&O_APPEND != 0

	return f, nil
}
// os/file_unix.go

// openFileNolog的unix实现
func openFileNolog(name string, flag int, perm FileMode) (*File, error) {
	setSticky := false
	if !supportsCreateWithStickyBit && flag&O_CREATE != 0 && perm&ModeSticky != 0 {
		if _, err := Stat(name); IsNotExist(err) {
			setSticky = true
		}
	}

	var r int
	var s poll.SysFile
	for {
		var e error
		//跳转到open
		r, s, e = open(name, flag|syscall.O_CLOEXEC, syscallMode(perm))
		if e == nil {
			break
		}

		// We have to check EINTR here, per issues 11180 and 39237.
		if e == syscall.EINTR {
			continue
		}

		return nil, &PathError{Op: "open", Path: name, Err: e}
	}

	// open(2) itself won't handle the sticky bit on *BSD and Solaris
	if setSticky {
		setStickyBit(name)
	}

	// There's a race here with fork/exec, which we are
	// content to live with. See ../syscall/exec_unix.go.
	if !supportsCloseOnExec {
		syscall.CloseOnExec(r)
	}

	kind := kindOpenFile
	if unix.HasNonblockFlag(flag) {
		kind = kindNonBlock
	}

	// 封装为File结构
	f := newFile(r, name, kind)
	f.pfd.SysFile = s
	return f, nil
}
// os/file_open_unix.go

func open(path string, flag int, perm uint32) (int, poll.SysFile, error) {
	// 跳转到syscall.Open
	fd, err := syscall.Open(path, flag, perm)
	return fd, poll.SysFile{}, err
}
// syscall/syscall_linux.go

func Open(path string, mode int, perm uint32) (fd int, err error) {
	// 跳转到openat
	return openat(AT_FDCWD, path, mode|O_LARGEFILE, perm)
}

//sys	openat(dirfd int, path string, flags int, mode uint32) (fd int, err error)

// syscall/zsyscall_linux_amd64.go

func openat(dirfd int, path string, flags int, mode uint32) (fd int, err error) {
	var _p0 *byte
	_p0, err = BytePtrFromString(path)
	if err != nil {
		return
	}
	// 跳转到Syscall6
	r0, _, e1 := Syscall6(SYS_OPENAT, uintptr(dirfd), uintptr(unsafe.Pointer(_p0)), uintptr(flags), uintptr(mode), 0, 0)
	fd = int(r0)
	if e1 != 0 {
		err = errnoErr(e1)
	}
	return
}
// syscall/syscall_linux.go

func Syscall6(trap, a1, a2, a3, a4, a5, a6 uintptr) (r1, r2 uintptr, err Errno) {
	runtime_entersyscall()
	// 跳转到RawSyscall6
	r1, r2, err = RawSyscall6(trap, a1, a2, a3, a4, a5, a6)
	runtime_exitsyscall()
	return
}

// N.B. RawSyscall6 is provided via linkname by runtime/internal/syscall.
//
// Errno is uintptr and thus compatible with the runtime/internal/syscall
// definition.
func RawSyscall6(trap, a1, a2, a3, a4, a5, a6 uintptr) (r1, r2 uintptr, err Errno)

// syscall/zsysnum_linux_amd64.go
	SYS_OPENAT                 = 257

// RawSyscall6是通过汇编实现的,传入SYS_OPENAT,最终调用openat函数
// openat函数是libc标准库中的函数,C语言定义为
int openat(int dirfd, const char *pathname, int flags, mode_t mode);
// runtime/internal/syscall/asm_linux_amd64.s

// Syscall6 的实现在这里
// func Syscall6(num, a1, a2, a3, a4, a5, a6 uintptr) (r1, r2, errno uintptr)
//
// We need to convert to the syscall ABI.
//
// arg | ABIInternal | Syscall
// ---------------------------
// num | AX          | AX
// a1  | BX          | DI
// a2  | CX          | SI
// a3  | DI          | DX
// a4  | SI          | R10
// a5  | R8          | R8
// a6  | R9          | R9
//
// r1  | AX          | AX
// r2  | BX          | DX
// err | CX          | part of AX
//
// Note that this differs from "standard" ABI convention, which would pass 4th
// arg in CX, not R10.
TEXT ·Syscall6<ABIInternal>(SB),NOSPLIT,$0
	// a6 already in R9.
	// a5 already in R8.
	MOVQ	SI, R10 // a4
	MOVQ	DI, DX  // a3
	MOVQ	CX, SI  // a2
	MOVQ	BX, DI  // a1
	// num already in AX.
	SYSCALL
	CMPQ	AX, $0xfffffffffffff001
	JLS	ok
	NEGQ	AX
	MOVQ	AX, CX  // errno
	MOVQ	$-1, AX // r1
	MOVQ	$0, BX  // r2
	RET
ok:
	// r1 already in AX.
	MOVQ	DX, BX // r2
	MOVQ	$0, CX // errno
	RET

// 然后回到openFileNolog中
// 在openFileNolog中,继续调用newFile,整体封装为File结构,原路返回
func newFile(fd int, name string, kind newFileKind) *File {
	f := &File{&file{
		pfd: poll.FD{
			Sysfd:         fd,
			IsStream:      true,
			ZeroReadIsEOF: true,
		},
		name:        name,
		stdoutOrErr: fd == 1 || fd == 2,
	}}

	pollable := kind == kindOpenFile || kind == kindPipe || kind == kindNonBlock

	// If the caller passed a non-blocking filedes (kindNonBlock),
	// we assume they know what they are doing so we allow it to be
	// used with kqueue.
	if kind == kindOpenFile {
		switch runtime.GOOS {
		case "darwin", "ios", "dragonfly", "freebsd", "netbsd", "openbsd":
			var st syscall.Stat_t
			err := ignoringEINTR(func() error {
				return syscall.Fstat(fd, &st)
			})
			typ := st.Mode & syscall.S_IFMT
			// Don't try to use kqueue with regular files on *BSDs.
			// On FreeBSD a regular file is always
			// reported as ready for writing.
			// On Dragonfly, NetBSD and OpenBSD the fd is signaled
			// only once as ready (both read and write).
			// Issue 19093.
			// Also don't add directories to the netpoller.
			if err == nil && (typ == syscall.S_IFREG || typ == syscall.S_IFDIR) {
				pollable = false
			}

			// In addition to the behavior described above for regular files,
			// on Darwin, kqueue does not work properly with fifos:
			// closing the last writer does not cause a kqueue event
			// for any readers. See issue #24164.
			if (runtime.GOOS == "darwin" || runtime.GOOS == "ios") && typ == syscall.S_IFIFO {
				pollable = false
			}
		}
	}

	clearNonBlock := false
	if pollable {
		if kind == kindNonBlock {
			// The descriptor is already in non-blocking mode.
			// We only set f.nonblock if we put the file into
			// non-blocking mode.
		} else if err := syscall.SetNonblock(fd, true); err == nil {
			f.nonblock = true
			clearNonBlock = true
		} else {
			pollable = false
		}
	}

	// An error here indicates a failure to register
	// with the netpoll system. That can happen for
	// a file descriptor that is not supported by
	// epoll/kqueue; for example, disk files on
	// Linux systems. We assume that any real error
	// will show up in later I/O.
	// We do restore the blocking behavior if it was set by us.
	if pollErr := f.pfd.Init("file", pollable); pollErr != nil && clearNonBlock {
		if err := syscall.SetNonblock(fd, false); err == nil {
			f.nonblock = false
		}
	}

	runtime.SetFinalizer(f.file, (*file).close)
	return f
}

GoLang实现跨平台的一些技巧01

最近在读GoLang的源码,源码中有一些跨平台的操作,Go处理的很有意思,在这整理一下。

以新建文件为例,对比一下几个常见平台的区别。

首先看下Windows平台的代码:

// os/file.go

// 新建文件
func Create(name string) (*File, error) {
	// 跳转到下面的OpenFile
	return OpenFile(name, O_RDWR|O_CREATE|O_TRUNC, 0666)
}

// OpenFile在这里还是平台无关的代码
func OpenFile(name string, flag int, perm FileMode) (*File, error) {
	testlog.Open(name)
	// 从openFileNolog开始,不同平台代码会有不同
	f, err := openFileNolog(name, flag, perm)
	if err != nil {
		return nil, err
	}
	f.appendMode = flag&O_APPEND != 0

	return f, nil
}
// os/file_windows.go

// openFileNolog的windows实现
func openFileNolog(name string, flag int, perm FileMode) (*File, error) {
	if name == "" {
		return nil, &PathError{Op: "open", Path: name, Err: syscall.ENOENT}
	}
	path := fixLongPath(name)
	// 跳转到了syscall.Open
	r, e := syscall.Open(path, flag|syscall.O_CLOEXEC, syscallMode(perm))
	if e != nil {
		// We should return EISDIR when we are trying to open a directory with write access.
		if e == syscall.ERROR_ACCESS_DENIED && (flag&O_WRONLY != 0 || flag&O_RDWR != 0) {
			pathp, e1 := syscall.UTF16PtrFromString(path)
			if e1 == nil {
				var fa syscall.Win32FileAttributeData
				e1 = syscall.GetFileAttributesEx(pathp, syscall.GetFileExInfoStandard, (*byte)(unsafe.Pointer(&fa)))
				if e1 == nil && fa.FileAttributes&syscall.FILE_ATTRIBUTE_DIRECTORY != 0 {
					e = syscall.EISDIR
				}
			}
		}
		return nil, &PathError{Op: "open", Path: name, Err: e}
	}

	// 封装为File结构
	f, e := newFile(r, name, "file"), nil
	if e != nil {
		return nil, &PathError{Op: "open", Path: name, Err: e}
	}
	return f, nil
}
// syscall/syscall_windows.go

func Open(path string, mode int, perm uint32) (fd Handle, err error) {
	if len(path) == 0 {
		return InvalidHandle, ERROR_FILE_NOT_FOUND
	}
	pathp, err := UTF16PtrFromString(path)
	if err != nil {
		return InvalidHandle, err
	}
	var access uint32
	switch mode & (O_RDONLY | O_WRONLY | O_RDWR) {
	case O_RDONLY:
		access = GENERIC_READ
	case O_WRONLY:
		access = GENERIC_WRITE
	case O_RDWR:
		access = GENERIC_READ | GENERIC_WRITE
	}
	if mode&O_CREAT != 0 {
		access |= GENERIC_WRITE
	}
	if mode&O_APPEND != 0 {
		access &^= GENERIC_WRITE
		access |= FILE_APPEND_DATA
	}
	sharemode := uint32(FILE_SHARE_READ | FILE_SHARE_WRITE)
	var sa *SecurityAttributes
	if mode&O_CLOEXEC == 0 {
		sa = makeInheritSa()
	}
	var createmode uint32
	switch {
	case mode&(O_CREAT|O_EXCL) == (O_CREAT | O_EXCL):
		createmode = CREATE_NEW
	case mode&(O_CREAT|O_TRUNC) == (O_CREAT | O_TRUNC):
		createmode = CREATE_ALWAYS
	case mode&O_CREAT == O_CREAT:
		createmode = OPEN_ALWAYS
	case mode&O_TRUNC == O_TRUNC:
		createmode = TRUNCATE_EXISTING
	default:
		createmode = OPEN_EXISTING
	}
	var attrs uint32 = FILE_ATTRIBUTE_NORMAL
	if perm&S_IWRITE == 0 {
		attrs = FILE_ATTRIBUTE_READONLY
		if createmode == CREATE_ALWAYS {
			// We have been asked to create a read-only file.
			// If the file already exists, the semantics of
			// the Unix open system call is to preserve the
			// existing permissions. If we pass CREATE_ALWAYS
			// and FILE_ATTRIBUTE_READONLY to CreateFile,
			// and the file already exists, CreateFile will
			// change the file permissions.
			// Avoid that to preserve the Unix semantics.
			h, e := CreateFile(pathp, access, sharemode, sa, TRUNCATE_EXISTING, FILE_ATTRIBUTE_NORMAL, 0)
			switch e {
			case ERROR_FILE_NOT_FOUND, _ERROR_BAD_NETPATH, ERROR_PATH_NOT_FOUND:
				// File does not exist. These are the same
				// errors as Errno.Is checks for ErrNotExist.
				// Carry on to create the file.
			default:
				// Success or some different error.
				return h, e
			}
		}
	}
	if createmode == OPEN_EXISTING && access == GENERIC_READ {
		// Necessary for opening directory handles.
		attrs |= FILE_FLAG_BACKUP_SEMANTICS
	}
	if mode&O_SYNC != 0 {
		const _FILE_FLAG_WRITE_THROUGH = 0x80000000
		attrs |= _FILE_FLAG_WRITE_THROUGH
	}

	// 跳转CreateFile
	return CreateFile(pathp, access, sharemode, sa, createmode, attrs, 0)
}


func CreateFile(name *uint16, access uint32, mode uint32, sa *SecurityAttributes, createmode uint32, attrs uint32, templatefile int32) (handle Handle, err error) {
	// 跳转Syscall9
	r0, _, e1 := Syscall9(procCreateFileW.Addr(), 7, uintptr(unsafe.Pointer(name)), uintptr(access), uintptr(mode), uintptr(unsafe.Pointer(sa)), uintptr(createmode), uintptr(attrs), uintptr(templatefile), 0, 0)
	handle = Handle(r0)
	if handle == InvalidHandle {
		err = errnoErr(e1)
	}
	return
}
// syscall/dll_windows.go
// 封装了Syscall9
func Syscall9(trap, nargs, a1, a2, a3, a4, a5, a6, a7, a8, a9 uintptr) (r1, r2 uintptr, err Errno)

// syscall/zsyscall_windows.go
// Syscall9中传入的API名为procCreateFileW 
procCreateFileW                        = modkernel32.NewProc("CreateFileW")

// 实际上最终调用了windows API CreateFileW,下面是CPP版本的API定义
// 到这里,也可以看到,通过Syscall的定义,比较巧妙的做了一定程度上的解耦
HANDLE CreateFileW(
  [in]           LPCWSTR               lpFileName,
  [in]           DWORD                 dwDesiredAccess,
  [in]           DWORD                 dwShareMode,
  [in, optional] LPSECURITY_ATTRIBUTES lpSecurityAttributes,
  [in]           DWORD                 dwCreationDisposition,
  [in]           DWORD                 dwFlagsAndAttributes,
  [in, optional] HANDLE                hTemplateFile
);
// runtime/syscall_windows.go

// Syscall9是在这里实现的
//go:linkname syscall_Syscall9 syscall.Syscall9
//go:nosplit
func syscall_Syscall9(fn, nargs, a1, a2, a3, a4, a5, a6, a7, a8, a9 uintptr) (r1, r2, err uintptr) {
	return syscall_SyscallN(fn, a1, a2, a3, a4, a5, a6, a7, a8, a9)
}

//go:linkname syscall_SyscallN syscall.SyscallN
//go:nosplit
func syscall_SyscallN(trap uintptr, args ...uintptr) (r1, r2, err uintptr) {
	nargs := len(args)

	// asmstdcall expects it can access the first 4 arguments
	// to load them into registers.
	var tmp [4]uintptr
	switch {
	case nargs < 4:
		copy(tmp[:], args)
		args = tmp[:]
	case nargs > maxArgs:
		panic("runtime: SyscallN has too many arguments")
	}

	lockOSThread()
	defer unlockOSThread()
	c := &getg().m.syscall
	c.fn = trap
	c.n = uintptr(nargs)
	c.args = uintptr(noescape(unsafe.Pointer(&args[0])))
	cgocall(asmstdcallAddr, unsafe.Pointer(c))
	return c.r1, c.r2, c.err
}

// 最后,通过cgocall,将go的调用,转换为c的调用
// 然后回到openFileNolog中
// 在openFileNolog中,继续调用newFile,整体封装为File结构,原路返回
func newFile(h syscall.Handle, name string, kind string) *File {
	if kind == "file" {
		var m uint32
		if syscall.GetConsoleMode(h, &m) == nil {
			kind = "console"
		}
		if t, err := syscall.GetFileType(h); err == nil && t == syscall.FILE_TYPE_PIPE {
			kind = "pipe"
		}
	}

	f := &File{&file{
		pfd: poll.FD{
			Sysfd:         h,
			IsStream:      true,
			ZeroReadIsEOF: true,
		},
		name: name,
	}}
	runtime.SetFinalizer(f.file, (*file).close)

	// Ignore initialization errors.
	// Assume any problems will show up in later I/O.
	f.pfd.Init(kind, false)

	return f
}

NEOHOPE大模型发展趋势预测2405

2024年5月大模型趋势
1、规模化遇到瓶颈,资源陷阱效应愈发明显,GPT5未能按时发布,估计遇到不小的技术问题
2、垂直化趋势明显,完全通用的大模型投产比不高,而垂直化的大模型可以在一定领域内保障效果的情况下,有效降低模型训练及推理成本,
3、移动化趋势明显,以苹果为首的各厂商在努力缩减模型规模,努力提升设备推理性能,通过大模型赋能移动终端
4、具身化初现效果,无论是人形机器人,还是机器人训练,效果显著
5、多模态大模型投产低,远不如多个模态的模型整合
6、部分整合类应用已经可以赚钱,比如Perplexity等
7、下半年没有盈利能力的大模型厂商财务压力会很大
8、美国对外大模型技术封锁会更加严格

一线厂商:
1、国外闭源:ChatGPT、Gemini、Claude、Mistral
2、国外开源:Llama3
3、国内闭源:月之暗面Kimi、质谱清言ChatGLM
4、国内开源:阿里通义千问

换行符引发的惨案

最近在读go源码。

本来环境都搭建好了,源码也上传git了。
但从另一台电脑下载源码后,报了一堆神奇的错误。
最后发现是go.env文件中,回车换行是按windows系统设定上传到git的,改为linux系统设定就好了。

想起入行以来,因为字符集、换行符、正斜杠反斜杠、tab还是空格,遇到的那堆坑,唏嘘不已。
希望UTF-8早日一统天下,希望各大平台别再特立独行。
非标准化害死人,多套标准更是害死人啊。

qwen.cpp简明教程

1、下载并编译qwen.cpp

git clone --recursive https://github.com/QwenLM/qwen.cpp
cd qwen.cpp
cmake -B build
cmake -B build -DGGML_OPENBLAS=ON
cmake -B build -DGGML_CUBLAS=ON
cmake --build build -j --config Release

2、下载模型,转化为ggml格式

#从hf下载模型,下载完成后,本地地址为 ~/.cache/huggingface/hub/模型名称
#部分代码文件会有缺失,可以到hf上对比下载
from transformers import AutoModelForCausalLM
model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen-7B-Chat",trust_remote_code=True)
#模型转化为ggml格式
#同时进行量化,降低资源需求
python3 qwen_cpp/convert.py -i PATH_TO_MODEL -t q4_0 -o qwen7b-q40-ggml.bin

3、运行模型

./build/bin/main -m qwen7b-q40-ggml.bin --tiktoken PATH_TO_MODEL/qwen.tiktoken -i

chatglm.cpp简明教程

1、下载并编译chatglm.cpp

git clone --recursive https://github.com/li-plus/chatglm.cpp.git
cd chatglm.cpp
git submodule update --init --recursive
#cmake -B build
cmake -B build -DGGML_OPENBLAS=ON
#cmake -B build -DGGML_CUBLAS=ON
cmake --build build -j --config Release

2、下载模型,转化为ggml格式

#从hf下载模型,下载完成后,本地地址为 ~/.cache/huggingface/hub/模型名称
#部分代码文件会有缺失,可以到hf上对比下载
from transformers import AutoModel
model = AutoModel.from_pretrained("THUDM/chatglm-6b",trust_remote_code=True)
#模型转化为ggml格式
#同时进行量化,降低资源需求
pip install torch tabulate tqdm transformers accelerate sentencepiece
python3 chatglm_cpp/convert.py -i PATH_TO_MODEL -t q4_0 -o chatglm-6b-q40-ggml.bin

3、运行模型

./build/bin/main -m chatglm-6b-q40-ggml.bin -i

4、常见问题

#下面的错误,是transformers版本太高导致
AttributeError: 'ChatGLMTokenizer' object has no attribute 'sp_tokenizer'. Did you mean: '_tokenize'?
#需要降低transformers版本
pip uninstall transformers
pip install transformers==4.33.2

大语言模型资料汇总

一、之前整理了一些大模型的Demo,汇总如下
1、ChatGPT
https://github.com/neohope/NeoDemosChatGPT

2、Llama2
https://github.com/neohope/NeoDemosLlama2
可同步看一下中文版Llama2
https://github.com/ymcui/Chinese-LLaMA-Alpaca-2

3、阿里千问
https://github.com/neohope/NeoDemosQwen

4、清华ChatGLM
https://github.com/neohope/NeoDemosChatGLM

二、建议看一下llama.cpp
1、llama.cpp
https://github.com/ggerganov/llama.cpp

2、python的llama.cpp封装
https://github.com/abetlen/llama-cpp-python

3、千问的qwen.cpp实现
https://github.com/QwenLM/qwen.cpp

4、ChatGLM的chatglm.cpp实现
https://github.com/li-plus/chatglm.cpp

三、还有量化
https://github.com/AutoGPTQ/AutoGPTQ

四、当然还有langchain
https://github.com/langchain-ai/langchain

五、如果有余力,看一下Transformer实现
https://github.com/huggingface/transformers

llama.cpp简要教程

1、下载并编译llama.cpp

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make

2、下载llama-2-7b-chat
a、可以从fb或hf下载
b、可以使用脚本下载工具,比如llama-dl
c、可以使用Chinese-LLaMA-2-7B
d、可以使用其他三方源

3、模型转换为ggml格式

python3 convert.py ../llama/llama-2-7b-chat/ 
Loading model file ../llama/llama-2-7b-chat/consolidated.00.pth
params = Params(n_vocab=32000, n_embd=4096, n_layer=32, n_ctx=2048, n_ff=11008, n_head=32, n_head_kv=32, n_experts=None, n_experts_used=None, f_norm_eps=1e-06, rope_scaling_type=None, f_rope_freq_base=None, f_rope_scale=None, n_orig_ctx=None, rope_finetuned=None, ftype=None, path_model=PosixPath('../llama/llama-2-7b-chat'))
Found vocab files: {'tokenizer.model': PosixPath('../llama/tokenizer.model'), 'vocab.json': None, 'tokenizer.json': None}
Loading vocab file '../llama/tokenizer.model', type 'spm'
Vocab info: <SentencePieceVocab with 32000 base tokens and 0 added tokens>
Special vocab info: <SpecialVocab with 0 merges, special tokens unset, add special tokens unset>
tok_embeddings.weight                            -> token_embd.weight                        | BF16   | [32000, 4096]
norm.weight                                      -> output_norm.weight                       | BF16   | [4096]
output.weight                                    -> output.weight                            | BF16   | [32000, 4096]
layers.0.attention.wq.weight                     -> blk.0.attn_q.weight                      | BF16   | [4096, 4096]
...
layers.31.ffn_norm.weight                        -> blk.31.ffn_norm.weight                   | BF16   | [4096]
skipping tensor rope_freqs
Writing ../llama/llama-2-7b-chat/ggml-model-f16.gguf, format 1
Ignoring added_tokens.json since model matches vocab size without it.
gguf: This GGUF file is for Little Endian only
[  1/291] Writing tensor token_embd.weight                      | size  32000 x   4096  | type F16  | T+   3
...
[291/291] Writing tensor blk.31.ffn_norm.weight                 | size   4096           | type F32  | T+ 314
Wrote ../llama/llama-2-7b-chat/ggml-model-f16.gguf

4、模型量化,减少资源使用

./quantize ../llama/llama-2-7b-chat/ggml-model-f16.gguf  ../llama/llama-2-7b-chat/ggml-model-f16-q4_0.gguf q4_0 
main: build = 2060 (5ed26e1f)
main: built with cc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0 for x86_64-linux-gnu
main: quantizing '../llama/llama-2-7b-chat/ggml-model-f16.gguf' to '../llama/llama-2-7b-chat/ggml-model-f16-q4_0.gguf' as Q4_0
llama_model_loader: loaded meta data with 15 key-value pairs and 291 tensors from ../llama/llama-2-7b-chat/ggml-model-f16.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv   0:                       general.architecture str              = llama
llama_model_loader: - kv   1:                               general.name str              = llama
llama_model_loader: - kv   2:                       llama.context_length u32              = 2048
llama_model_loader: - kv   3:                     llama.embedding_length u32              = 4096
llama_model_loader: - kv   4:                          llama.block_count u32              = 32
llama_model_loader: - kv   5:                  llama.feed_forward_length u32              = 11008
llama_model_loader: - kv   6:                 llama.rope.dimension_count u32              = 128
llama_model_loader: - kv   7:                 llama.attention.head_count u32              = 32
llama_model_loader: - kv   8:              llama.attention.head_count_kv u32              = 32
llama_model_loader: - kv   9:     llama.attention.layer_norm_rms_epsilon f32              = 0.000001
llama_model_loader: - kv  10:                          general.file_type u32              = 1
llama_model_loader: - kv  11:                       tokenizer.ggml.model str              = llama
llama_model_loader: - kv  12:                      tokenizer.ggml.tokens arr[str,32000]   = ["<unk>", "<s>", "</s>", "<0x00>", "<...
llama_model_loader: - kv  13:                      tokenizer.ggml.scores arr[f32,32000]   = [0.000000, 0.000000, 0.000000, 0.0000...
llama_model_loader: - kv  14:                  tokenizer.ggml.token_type arr[i32,32000]   = [2, 3, 3, 6, 6, 6, 6, 6, 6, 6, 6, 6, ...
llama_model_loader: - type  f32:   65 tensors
llama_model_loader: - type  f16:  226 tensors
llama_model_quantize_internal: meta size = 740928 bytes
[   1/ 291]                    token_embd.weight - [ 4096, 32000,     1,     1], type =    f16, quantizing to q4_0 .. size =   250.00 MiB ->    70.31 MiB | hist: 0.037 0.016 0.025 0.039 0.057 0.077 0.096 0.111 0.116 0.111 0.096 0.077 0.057 0.039 0.025 0.021 
...   
[ 291/ 291]               blk.31.ffn_norm.weight - [ 4096,     1,     1,     1], type =    f32, size =    0.016 MB
llama_model_quantize_internal: model size  = 12853.02 MB
llama_model_quantize_internal: quant size  =  3647.87 MB
llama_model_quantize_internal: hist: 0.036 0.015 0.025 0.039 0.056 0.076 0.096 0.112 0.118 0.112 0.096 0.077 0.056 0.039 0.025 0.021 
main: quantize time = 323302.84 ms
main:    total time = 323302.84 ms

5、使用模型

./main -m ../llama/llama-2-7b-chat/ggml-model-f16-q4_0.gguf -n 256 --repeat_penalty 1.0 --color -ins